* [PATCH] cfq-iosched: non-rot devices do not need queue merging @ 2009-12-30 12:10 Corrado Zoccolo 2009-12-30 18:45 ` Jens Axboe 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2009-12-30 12:10 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng, Corrado Zoccolo Non-rotational devices' performance is not affected by the distance between requests, so there is no point in paying the overhead of merging queues of nearby requests. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> --- block/cfq-iosched.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 918c7fd..01bb0f3 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -1268,7 +1268,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) return; if (!cfqq->next_rq) return; - + if (blk_queue_nonrot(cfqd->queue)) + return; cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, blk_rq_pos(cfqq->next_rq), &parent, &p); @@ -1689,7 +1690,7 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, struct cfq_queue *__cfqq; sector_t sector = cfqd->last_position; - if (RB_EMPTY_ROOT(root)) + if (RB_EMPTY_ROOT(root) || blk_queue_nonrot(cfqd->queue)) return NULL; /* -- 1.6.4.4 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need queue merging 2009-12-30 12:10 [PATCH] cfq-iosched: non-rot devices do not need queue merging Corrado Zoccolo @ 2009-12-30 18:45 ` Jens Axboe 2009-12-30 20:31 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Jens Axboe @ 2009-12-30 18:45 UTC (permalink / raw) To: Corrado Zoccolo Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Wed, Dec 30 2009, Corrado Zoccolo wrote: > Non rotational devices' performances are not affected by > distance of requests, so there is no point in having overhead > to merge queues of nearby requests. If the distance is zero, it may still make a big difference (at least for writes). This check would be better as "ncq and doesn't suck", ala blk_queue_nonrot(q) && tagged like we do elsewhere. -- Jens Axboe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need queue merging 2009-12-30 18:45 ` Jens Axboe @ 2009-12-30 20:31 ` Corrado Zoccolo 2009-12-30 21:11 ` Jens Axboe 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2009-12-30 20:31 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Wed, Dec 30, 2009 at 7:45 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > On Wed, Dec 30 2009, Corrado Zoccolo wrote: >> Non rotational devices' performances are not affected by >> distance of requests, so there is no point in having overhead >> to merge queues of nearby requests. > > If the distance is zero, it may still make a big difference (at least > for writes). This check would be better as "ncq and doesn't suck", ala > > blk_queue_nonrot(q) && tagged > > like we do elsewhere. For reads, though, even flash cards and netbook SSDs are completely unaffected. I have done a few experiments on my available disks: * http://dl.dropbox.com/u/3525644/service_time.png (I used the program: http://dl.dropbox.com/u/3525644/stride.c to get the graphs). For distance 0, I think request merging will be more effective than queue merging. Moreover, I think the multi-thread trick of using a large I/O depth is used for reads, not writes (where simply issuing buffered writes already achieves a similar effect), so I think it is safe to disable queue merging for all non-rotational devices. Thanks, Corrado > > -- > Jens Axboe > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need queue merging 2009-12-30 20:31 ` Corrado Zoccolo @ 2009-12-30 21:11 ` Jens Axboe 2009-12-30 21:21 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Jens Axboe @ 2009-12-30 21:11 UTC (permalink / raw) To: Corrado Zoccolo Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Wed, Dec 30 2009, Corrado Zoccolo wrote: > On Wed, Dec 30, 2009 at 7:45 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > > On Wed, Dec 30 2009, Corrado Zoccolo wrote: > >> Non rotational devices' performances are not affected by > >> distance of requests, so there is no point in having overhead > >> to merge queues of nearby requests. > > > > If the distance is zero, it may still make a big difference (at least > > for writes). This check would be better as "ncq and doesn't suck", ala > > > > blk_queue_nonrot(q) && tagged > > > > like we do elsewhere. > > For reads, though, even flash cards and netbook ssds are completely > unaffected. I have done few experiments on my available disks: > * http://dl.dropbox.com/u/3525644/service_time.png (I used the > program: http://dl.dropbox.com/u/3525644/stride.c to get the graphs). Completely agree, it's writes that matter (as mentioned). > For distance 0, I think request merging will be more effective than > queue merging, moreover I think the multi-thread trick to have large Definitely true, but we don't allow cross cfqq merges to begin with. > I/O depth is used for reads, not writes (where simply issuing buffered > writes already achieves a similar effect), so I think it is safe to > disable it for all non-rotational devices. That still leaves direct writes. Granted it's a problem with a huge scope, but still. -- Jens Axboe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need queue merging 2009-12-30 21:11 ` Jens Axboe @ 2009-12-30 21:21 ` Corrado Zoccolo 2009-12-30 21:34 ` Jens Axboe 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2009-12-30 21:21 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Wed, Dec 30, 2009 at 10:11 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > On Wed, Dec 30 2009, Corrado Zoccolo wrote: >> On Wed, Dec 30, 2009 at 7:45 PM, Jens Axboe <jens.axboe@oracle.com> wrote: >> > On Wed, Dec 30 2009, Corrado Zoccolo wrote: >> >> Non rotational devices' performances are not affected by >> >> distance of requests, so there is no point in having overhead >> >> to merge queues of nearby requests. >> > >> > If the distance is zero, it may still make a big difference (at least >> > for writes). This check would be better as "ncq and doesn't suck", ala >> > >> > blk_queue_nonrot(q) && tagged >> > >> > like we do elsewhere. >> >> For reads, though, even flash cards and netbook ssds are completely >> unaffected. I have done few experiments on my available disks: >> * http://dl.dropbox.com/u/3525644/service_time.png (I used the >> program: http://dl.dropbox.com/u/3525644/stride.c to get the graphs). > > Completely agree, it's writes that matter (as mentioned). > >> For distance 0, I think request merging will be more effective than >> queue merging, moreover I think the multi-thread trick to have large > > Definitely true, but we don't allow cross cfqq merges to begin with. > >> I/O depth is used for reads, not writes (where simply issuing buffered >> writes already achieves a similar effect), so I think it is safe to >> disable it for all non-rotational devices. > > That still leaves direct writes. Granted it's a problem with a huge > scope, but still. Maybe I can mark sync queues that have write requests, and only add those ones to the prio tree. 
For writes, merging queues (and therefore requests) can probably help even the smart ssds. Thanks, Corrado > > -- > Jens Axboe > > -- __________________________________________________________________________ dott. Corrado Zoccolo mailto:czoccolo@gmail.com PhD - Department of Computer Science - University of Pisa, Italy -------------------------------------------------------------------------- The self-confidence of a warrior is not the self-confidence of the average man. The average man seeks certainty in the eyes of the onlooker and calls that self-confidence. The warrior seeks impeccability in his own eyes and calls that humbleness. Tales of Power - C. Castaneda ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need queue merging 2009-12-30 21:21 ` Corrado Zoccolo @ 2009-12-30 21:34 ` Jens Axboe 2009-12-30 22:22 ` [PATCH] cfq-iosched: non-rot devices do not need read " Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Jens Axboe @ 2009-12-30 21:34 UTC (permalink / raw) To: Corrado Zoccolo Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Wed, Dec 30 2009, Corrado Zoccolo wrote: > On Wed, Dec 30, 2009 at 10:11 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > > On Wed, Dec 30 2009, Corrado Zoccolo wrote: > >> On Wed, Dec 30, 2009 at 7:45 PM, Jens Axboe <jens.axboe@oracle.com> wrote: > >> > On Wed, Dec 30 2009, Corrado Zoccolo wrote: > >> >> Non rotational devices' performances are not affected by > >> >> distance of requests, so there is no point in having overhead > >> >> to merge queues of nearby requests. > >> > > >> > If the distance is zero, it may still make a big difference (at least > >> > for writes). This check would be better as "ncq and doesn't suck", ala > >> > > >> > blk_queue_nonrot(q) && tagged > >> > > >> > like we do elsewhere. > >> > >> For reads, though, even flash cards and netbook ssds are completely > >> unaffected. I have done few experiments on my available disks: > >> * http://dl.dropbox.com/u/3525644/service_time.png (I used the > >> program: http://dl.dropbox.com/u/3525644/stride.c to get the graphs). > > > > Completely agree, it's writes that matter (as mentioned). > > > >> For distance 0, I think request merging will be more effective than > >> queue merging, moreover I think the multi-thread trick to have large > > > > Definitely true, but we don't allow cross cfqq merges to begin with. > > > >> I/O depth is used for reads, not writes (where simply issuing buffered > >> writes already achieves a similar effect), so I think it is safe to > >> disable it for all non-rotational devices. > > > > That still leaves direct writes. 
Granted it's a problem with a huge > > scope, but still. > Maybe I can mark sync queues that have write requests, and only add > those ones to the prio tree. That sounds like a solution and avoids the merge/breakup pain for (by far) most use cases. > For writes, merging queues (and therefore requests) can probably help > even the smart ssds. Yes, but we are getting to the point of having to be more careful about CPU cycles on SSDs. But lets do it, I'll be spending a good chunk of time on that very shortly anyway. -- Jens Axboe ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2009-12-30 21:34 ` Jens Axboe @ 2009-12-30 22:22 ` Corrado Zoccolo 2010-01-04 14:47 ` Vivek Goyal 2010-01-10 21:04 ` [PATCH] cfq-iosched: NCQ SSDs " Corrado Zoccolo 0 siblings, 2 replies; 41+ messages in thread From: Corrado Zoccolo @ 2009-12-30 22:22 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng, Corrado Zoccolo Non-rotational devices' performance is not affected by the distance between read requests, so there is no point in paying the overhead of merging such queues. This doesn't apply to writes, so this patch changes the queued[] field to be indexed by READ/WRITE instead of SYNC/ASYNC, and computes proximity only for queues with WRITE requests. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> --- block/cfq-iosched.c | 20 +++++++++++--------- 1 files changed, 11 insertions(+), 9 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 918c7fd..7da9391 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -108,9 +108,9 @@ struct cfq_queue { struct rb_root sort_list; /* if fifo isn't expired, next request to serve */ struct request *next_rq; - /* requests queued in sort_list */ + /* requests queued in sort_list, indexed by READ/WRITE */ int queued[2]; - /* currently allocated requests */ + /* currently allocated requests, indexed by READ/WRITE */ int allocated[2]; /* fifo list of requests in sort_list */ struct list_head fifo; @@ -1268,7 +1268,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) return; if (!cfqq->next_rq) return; - + if (blk_queue_nonrot(cfqd->queue) && !cfqq->queued[WRITE]) + return; cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, blk_rq_pos(cfqq->next_rq), &parent, &p); @@ -1337,10 +1338,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq) static void cfq_del_rq_rb(struct request *rq) { struct
cfq_queue *cfqq = RQ_CFQQ(rq); - const int sync = rq_is_sync(rq); + const int rw = rq_data_dir(rq); - BUG_ON(!cfqq->queued[sync]); - cfqq->queued[sync]--; + BUG_ON(!cfqq->queued[rw]); + cfqq->queued[rw]--; elv_rb_del(&cfqq->sort_list, rq); @@ -1363,7 +1364,7 @@ static void cfq_add_rq_rb(struct request *rq) struct cfq_data *cfqd = cfqq->cfqd; struct request *__alias, *prev; - cfqq->queued[rq_is_sync(rq)]++; + cfqq->queued[rq_data_dir(rq)]++; /* * looks a little odd, but the first insert might return an alias. @@ -1393,7 +1394,7 @@ static void cfq_add_rq_rb(struct request *rq) static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq) { elv_rb_del(&cfqq->sort_list, rq); - cfqq->queued[rq_is_sync(rq)]--; + cfqq->queued[rq_data_dir(rq)]--; cfq_add_rq_rb(rq); } @@ -1689,7 +1690,8 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, struct cfq_queue *__cfqq; sector_t sector = cfqd->last_position; - if (RB_EMPTY_ROOT(root)) + if (RB_EMPTY_ROOT(root) || + (blk_queue_nonrot(cfqd->queue) && !cur_cfqq->queued[WRITE])) return NULL; /* -- 1.6.4.4 ^ permalink raw reply related [flat|nested] 41+ messages in thread
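[Editorial note: the SYNC/ASYNC to READ/WRITE re-indexing in the patch above can be illustrated with a toy userspace model. This is not kernel code; the names only mimic cfq's, and the point is why the split is needed: with sync/async indexing, a direct (sync) write lands in the same bucket as reads, so queued[] cannot answer "does this queue hold writes?".]

```c
/* Toy model of the queued[] re-indexing in the patch above (not kernel
 * code; flag names only mimic cfq's).  rq_is_sync() is true for all
 * reads and for sync (e.g. O_DIRECT/AIO) writes, so SYNC/ASYNC
 * indexing cannot separate a direct write from a read; READ/WRITE
 * indexing can, which makes the cfqq->queued[WRITE] check meaningful. */
#include <stdbool.h>

enum { TOY_READ = 0, TOY_WRITE = 1 };   /* mimics READ/WRITE */
enum { TOY_ASYNC = 0, TOY_SYNC = 1 };   /* mimics ASYNC/SYNC */

struct toy_rq {
    bool is_write;  /* what rq_data_dir(rq) reports */
    bool is_sync;   /* what rq_is_sync(rq) reports */
};

/* old scheme: queued[rq_is_sync(rq)] */
static int old_index(const struct toy_rq *rq)
{
    return rq->is_sync ? TOY_SYNC : TOY_ASYNC;
}

/* new scheme: queued[rq_data_dir(rq)] */
static int new_index(const struct toy_rq *rq)
{
    return rq->is_write ? TOY_WRITE : TOY_READ;
}
```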
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2009-12-30 22:22 ` [PATCH] cfq-iosched: non-rot devices do not need read " Corrado Zoccolo @ 2010-01-04 14:47 ` Vivek Goyal 2010-01-04 16:36 ` Corrado Zoccolo 2010-01-10 21:04 ` [PATCH] cfq-iosched: NCQ SSDs " Corrado Zoccolo 1 sibling, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-04 14:47 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: > Non rotational devices' performances are not affected by > distance of read requests, so there is no point in having > overhead to merge such queues. > This doesn't apply to writes, so this patch changes the > queued[] field, to be indexed by READ/WRITE instead of > SYNC/ASYNC, and only compute proximity for queues with > WRITE requests. > Hi Corrado, What's the reason that reads don't benefit from merging queues and hence merging requests and only writes do on SSD? > Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> > --- > block/cfq-iosched.c | 20 +++++++++++--------- > 1 files changed, 11 insertions(+), 9 deletions(-) > > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c > index 918c7fd..7da9391 100644 > --- a/block/cfq-iosched.c > +++ b/block/cfq-iosched.c > @@ -108,9 +108,9 @@ struct cfq_queue { > struct rb_root sort_list; > /* if fifo isn't expired, next request to serve */ > struct request *next_rq; > - /* requests queued in sort_list */ > + /* requests queued in sort_list, indexed by READ/WRITE */ > int queued[2]; > - /* currently allocated requests */ > + /* currently allocated requests, indexed by READ/WRITE */ > int allocated[2]; Sometime back Jens had changed all READ/WRITE indexing to SYNC/ASYNC indexing throughout IO schedulers and block layer. Personally I would prefer to keep it that way and not have a mix of SYNC/ASYNC and READ/WRITE indexing in code. What are we gaining by this patch? 
Save some CPU cycles by not merging and splitting the read cfqq on SSDs? Do you have any numbers on how much is saved? My knee-jerk reaction is that if the gains are not significant, let's not do this optimization and keep the code simple. > /* fifo list of requests in sort_list */ > struct list_head fifo; > @@ -1268,7 +1268,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) > return; > if (!cfqq->next_rq) > return; > - > + if (blk_queue_nonrot(cfqd->queue) && !cfqq->queued[WRITE]) > + return; A one- or two-line comment here would help explain why writes still benefit and reads do not. > cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; > __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, > blk_rq_pos(cfqq->next_rq), &parent, &p); > @@ -1337,10 +1338,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq) > static void cfq_del_rq_rb(struct request *rq) > { > struct cfq_queue *cfqq = RQ_CFQQ(rq); > - const int sync = rq_is_sync(rq); > + const int rw = rq_data_dir(rq); > > - BUG_ON(!cfqq->queued[sync]); > - cfqq->queued[sync]--; > + BUG_ON(!cfqq->queued[rw]); > + cfqq->queued[rw]--; > > elv_rb_del(&cfqq->sort_list, rq); > > @@ -1363,7 +1364,7 @@ static void cfq_add_rq_rb(struct request *rq) > struct cfq_data *cfqd = cfqq->cfqd; > struct request *__alias, *prev; > > - cfqq->queued[rq_is_sync(rq)]++; > + cfqq->queued[rq_data_dir(rq)]++; > > /* > * looks a little odd, but the first insert might return an alias.
> @@ -1393,7 +1394,7 @@ static void cfq_add_rq_rb(struct request *rq) > static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq) > { > elv_rb_del(&cfqq->sort_list, rq); > - cfqq->queued[rq_is_sync(rq)]--; > + cfqq->queued[rq_data_dir(rq)]--; > cfq_add_rq_rb(rq); > } > > @@ -1689,7 +1690,8 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, > struct cfq_queue *__cfqq; > sector_t sector = cfqd->last_position; > > - if (RB_EMPTY_ROOT(root)) > + if (RB_EMPTY_ROOT(root) || > + (blk_queue_nonrot(cfqd->queue) && !cur_cfqq->queued[WRITE])) > return NULL; > > /* > -- > 1.6.4.4 ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 14:47 ` Vivek Goyal @ 2010-01-04 16:36 ` Corrado Zoccolo 2010-01-04 16:51 ` Jeff Moyer 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-04 16:36 UTC (permalink / raw) To: Vivek Goyal Cc: Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng Hi Vivek, On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: >> Non rotational devices' performances are not affected by >> distance of read requests, so there is no point in having >> overhead to merge such queues. >> This doesn't apply to writes, so this patch changes the >> queued[] field, to be indexed by READ/WRITE instead of >> SYNC/ASYNC, and only compute proximity for queues with >> WRITE requests. >> > > Hi Corrado, > > What's the reason that reads don't benefit from merging queues and hence > merging requests and only writes do on SSD? On SSDs, reads are just limited by the maximum transfer rate, and larger (i.e. merged) reads will just take proportionally longer. >> Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> >> --- >> block/cfq-iosched.c | 20 +++++++++++--------- >> 1 files changed, 11 insertions(+), 9 deletions(-) >> >> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c >> index 918c7fd..7da9391 100644 >> --- a/block/cfq-iosched.c >> +++ b/block/cfq-iosched.c >> @@ -108,9 +108,9 @@ struct cfq_queue { >> struct rb_root sort_list; >> /* if fifo isn't expired, next request to serve */ >> struct request *next_rq; >> - /* requests queued in sort_list */ >> + /* requests queued in sort_list, indexed by READ/WRITE */ >> int queued[2]; >> - /* currently allocated requests */ >> + /* currently allocated requests, indexed by READ/WRITE */ >> int allocated[2]; > > Sometime back Jens had changed all READ/WRITE indexing to SYNC/ASYNC > indexing throughout IO schedulers and block layer. Not completely.
The allocated field (for which I fixed only the comment) is still addressed as READ/WRITE. > Personally I would > prefer to keep it that way and not have a mix of SYNC/ASYNC and READ/WRITE > indexing in code. I think that, as long as it is documented, it should be fine. > What are we gaining by this patch? Save some cpu cycles by not merging > and splitting the read cfqq on ssd? Yes. We should save a lot of cycles by avoiding the rb-tree management needed for those operations. Jens' position is that for fast SSDs, we need to save CPU cycles if we want to perform well. > Do you have any numbers how much is > the saving. My knee jerk reaction is that if gains are not significant, > lets not do this optimization and let the code be simple. I think we are actually simplifying the code, removing an optimization (queue merging) when it is not needed. When you want to reason about how the code performs on SSD, removing the unknown of queue merging renders the problem easier. > > >> /* fifo list of requests in sort_list */ >> struct list_head fifo; >> @@ -1268,7 +1268,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) >> return; >> if (!cfqq->next_rq) >> return; >> - >> + if (blk_queue_nonrot(cfqd->queue) && !cfqq->queued[WRITE]) >> + return; > > A 1-2 line comment here will help about why writes still benefit and not > reads. > It's because low-end SSDs are penalized by small writes. I don't have a high-end SSD to test with, but Jens is going to do more testing, and he can eventually disable merging for writes as well if he sees an improvement. Note that this is not the usual async write, but sync writes with AIO, which I think is quite a niche.
>> cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; >> __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, >> blk_rq_pos(cfqq->next_rq), &parent, &p); >> @@ -1337,10 +1338,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq) >> static void cfq_del_rq_rb(struct request *rq) >> { >> struct cfq_queue *cfqq = RQ_CFQQ(rq); >> - const int sync = rq_is_sync(rq); >> + const int rw = rq_data_dir(rq); >> >> - BUG_ON(!cfqq->queued[sync]); >> - cfqq->queued[sync]--; >> + BUG_ON(!cfqq->queued[rw]); >> + cfqq->queued[rw]--; >> >> elv_rb_del(&cfqq->sort_list, rq); >> >> @@ -1363,7 +1364,7 @@ static void cfq_add_rq_rb(struct request *rq) >> struct cfq_data *cfqd = cfqq->cfqd; >> struct request *__alias, *prev; >> >> - cfqq->queued[rq_is_sync(rq)]++; >> + cfqq->queued[rq_data_dir(rq)]++; >> >> /* >> * looks a little odd, but the first insert might return an alias. >> @@ -1393,7 +1394,7 @@ static void cfq_add_rq_rb(struct request *rq) >> static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq) >> { >> elv_rb_del(&cfqq->sort_list, rq); >> - cfqq->queued[rq_is_sync(rq)]--; >> + cfqq->queued[rq_data_dir(rq)]--; >> cfq_add_rq_rb(rq); >> } >> >> @@ -1689,7 +1690,8 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, >> struct cfq_queue *__cfqq; >> sector_t sector = cfqd->last_position; >> >> - if (RB_EMPTY_ROOT(root)) >> + if (RB_EMPTY_ROOT(root) || >> + (blk_queue_nonrot(cfqd->queue) && !cur_cfqq->queued[WRITE])) >> return NULL; >> >> /* >> -- >> 1.6.4.4 > Thanks Corrado ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 16:36 ` Corrado Zoccolo @ 2010-01-04 16:51 ` Jeff Moyer 2010-01-04 18:32 ` Vivek Goyal 2010-01-04 18:37 ` Corrado Zoccolo 0 siblings, 2 replies; 41+ messages in thread From: Jeff Moyer @ 2010-01-04 16:51 UTC (permalink / raw) To: Corrado Zoccolo Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Corrado Zoccolo <czoccolo@gmail.com> writes: > Hi Vivkek, > > On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: >>> Non rotational devices' performances are not affected by >>> distance of read requests, so there is no point in having >>> overhead to merge such queues. >>> This doesn't apply to writes, so this patch changes the >>> queued[] field, to be indexed by READ/WRITE instead of >>> SYNC/ASYNC, and only compute proximity for queues with >>> WRITE requests. >>> >> >> Hi Corrado, >> >> What's the reason that reads don't benefit from merging queues and hence >> merging requests and only writes do on SSD? > > On SSDs, reads are just limited by the maximum transfer rate, and > larger (i.e. merged) reads will just take proportionally longer. This is simply not true. You can get more bandwidth from an SSD (I just checked numbers for 2 vendors' devices) by issuing larger read requests, no matter whether the access pattern is sequential or random. Cheers, Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 16:51 ` Jeff Moyer @ 2010-01-04 18:32 ` Vivek Goyal 2010-01-04 18:37 ` Corrado Zoccolo 1 sibling, 0 replies; 41+ messages in thread From: Vivek Goyal @ 2010-01-04 18:32 UTC (permalink / raw) To: Jeff Moyer Cc: Corrado Zoccolo, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Mon, Jan 04, 2010 at 11:51:00AM -0500, Jeff Moyer wrote: > Corrado Zoccolo <czoccolo@gmail.com> writes: > > > Hi Vivkek, > > > > On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > >> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: > >>> Non rotational devices' performances are not affected by > >>> distance of read requests, so there is no point in having > >>> overhead to merge such queues. > >>> This doesn't apply to writes, so this patch changes the > >>> queued[] field, to be indexed by READ/WRITE instead of > >>> SYNC/ASYNC, and only compute proximity for queues with > >>> WRITE requests. > >>> > >> > >> Hi Corrado, > >> > >> What's the reason that reads don't benefit from merging queues and hence > >> merging requests and only writes do on SSD? > > > > On SSDs, reads are just limited by the maximum transfer rate, and > > larger (i.e. merged) reads will just take proportionally longer. > > This is simply not true. You can get more bandwidth from an SSD (I just > checked numbers for 2 vendors' devices) by issuing larger read requests, > no matter whether the access pattern is sequential or random. > In my simple testing of 4 fio threads doing direct sequential reads throughput varies significantly if I vary bs from 4K to 128K. bs=4K 65MB/s bs=128K 228MB/s Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
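[Editorial note: the exact fio invocation behind the numbers above is not given in the thread. A job file approximating the described test (four threads doing direct sequential reads, block size varied between runs) might look like this; the target device and runtime are assumptions:]

```ini
; hypothetical job file approximating the test above; run once with
; bs=4k and again with bs=128k, then compare aggregate bandwidth
[global]
ioengine=sync     ; plain read(2), no async submission
direct=1          ; O_DIRECT, bypassing the page cache
rw=read           ; sequential reads
numjobs=4         ; "4 fio threads"
time_based
runtime=30        ; assumption: time-based run length
filename=/dev/sdb ; assumption: the SSD under test

[seqread]
bs=4k             ; second run: bs=128k
```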
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 16:51 ` Jeff Moyer 2010-01-04 18:32 ` Vivek Goyal @ 2010-01-04 18:37 ` Corrado Zoccolo 2010-01-04 18:51 ` Vivek Goyal 1 sibling, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-04 18:37 UTC (permalink / raw) To: Jeff Moyer Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Mon, Jan 4, 2010 at 5:51 PM, Jeff Moyer <jmoyer@redhat.com> wrote: > Corrado Zoccolo <czoccolo@gmail.com> writes: > >> Hi Vivkek, >> >> On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >>> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: >>>> Non rotational devices' performances are not affected by >>>> distance of read requests, so there is no point in having >>>> overhead to merge such queues. >>>> This doesn't apply to writes, so this patch changes the >>>> queued[] field, to be indexed by READ/WRITE instead of >>>> SYNC/ASYNC, and only compute proximity for queues with >>>> WRITE requests. >>>> >>> >>> Hi Corrado, >>> >>> What's the reason that reads don't benefit from merging queues and hence >>> merging requests and only writes do on SSD? >> >> On SSDs, reads are just limited by the maximum transfer rate, and >> larger (i.e. merged) reads will just take proportionally longer. > > This is simply not true. You can get more bandwidth from an SSD (I just > checked numbers for 2 vendors' devices) by issuing larger read requests, > no matter whether the access pattern is sequential or random. I know, but the performance increase given the size is sublinear, and the situation here is slightly different. In order for the requests to be merged, they have to be submitted concurrently. So you have to compare 2 concurrent requests of size x with one request of size 2*x (with some CPU overhead). 
Moreover, you always pay the CPU overhead, even if you can't do the merging, and you must be very lucky to keep merging, because it means the two processes are working in lockstep; it is not sufficient that the requests are just nearby, as for rotational disks. Thanks, Corrado > > Cheers, > Jeff > -- __________________________________________________________________________ dott. Corrado Zoccolo mailto:czoccolo@gmail.com PhD - Department of Computer Science - University of Pisa, Italy -------------------------------------------------------------------------- The self-confidence of a warrior is not the self-confidence of the average man. The average man seeks certainty in the eyes of the onlooker and calls that self-confidence. The warrior seeks impeccability in his own eyes and calls that humbleness. Tales of Power - C. Castaneda ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 18:37 ` Corrado Zoccolo @ 2010-01-04 18:51 ` Vivek Goyal 2010-01-04 19:04 ` Jeff Moyer 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-04 18:51 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Mon, Jan 04, 2010 at 07:37:17PM +0100, Corrado Zoccolo wrote: > On Mon, Jan 4, 2010 at 5:51 PM, Jeff Moyer <jmoyer@redhat.com> wrote: > > Corrado Zoccolo <czoccolo@gmail.com> writes: > > > >> Hi Vivkek, > >> > >> On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > >>> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: > >>>> Non rotational devices' performances are not affected by > >>>> distance of read requests, so there is no point in having > >>>> overhead to merge such queues. > >>>> This doesn't apply to writes, so this patch changes the > >>>> queued[] field, to be indexed by READ/WRITE instead of > >>>> SYNC/ASYNC, and only compute proximity for queues with > >>>> WRITE requests. > >>>> > >>> > >>> Hi Corrado, > >>> > >>> What's the reason that reads don't benefit from merging queues and hence > >>> merging requests and only writes do on SSD? > >> > >> On SSDs, reads are just limited by the maximum transfer rate, and > >> larger (i.e. merged) reads will just take proportionally longer. > > > > This is simply not true. You can get more bandwidth from an SSD (I just > > checked numbers for 2 vendors' devices) by issuing larger read requests, > > no matter whether the access pattern is sequential or random. > I know, but the performance increase given the size is sublinear, and > the situation here is slightly different. > In order for the requests to be merged, they have to be submitted concurrently. > So you have to compare 2 concurrent requests of size x with one > request of size 2*x (with some CPU overhead). 
> Moreover, you always pay the CPU overhead, even if you can't do the > merging, and you must be very lucky to keep merging, because it means > the two processes are working in lockstep; it is not sufficient that > the requests are just nearby, as for rotational disks. > For Jeff, at least "dump" utility threads were kind of working in lockstep for writes and he gained significantly by merging these queues together. So the argument is that the CPU overhead saving in this case is more substantial than the gains made by lockstep read threads. I think we will need some numbers to justify that. Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 18:51 ` Vivek Goyal @ 2010-01-04 19:04 ` Jeff Moyer 2010-01-04 20:37 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Jeff Moyer @ 2010-01-04 19:04 UTC (permalink / raw) To: Vivek Goyal Cc: Corrado Zoccolo, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Vivek Goyal <vgoyal@redhat.com> writes: > On Mon, Jan 04, 2010 at 07:37:17PM +0100, Corrado Zoccolo wrote: >> On Mon, Jan 4, 2010 at 5:51 PM, Jeff Moyer <jmoyer@redhat.com> wrote: >> > Corrado Zoccolo <czoccolo@gmail.com> writes: >> > >> >> Hi Vivkek, >> >> >> >> On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >> >>> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: >> >>>> Non rotational devices' performances are not affected by >> >>>> distance of read requests, so there is no point in having >> >>>> overhead to merge such queues. >> >>>> This doesn't apply to writes, so this patch changes the >> >>>> queued[] field, to be indexed by READ/WRITE instead of >> >>>> SYNC/ASYNC, and only compute proximity for queues with >> >>>> WRITE requests. >> >>>> >> >>> >> >>> Hi Corrado, >> >>> >> >>> What's the reason that reads don't benefit from merging queues and hence >> >>> merging requests and only writes do on SSD? >> >> >> >> On SSDs, reads are just limited by the maximum transfer rate, and >> >> larger (i.e. merged) reads will just take proportionally longer. >> > >> > This is simply not true. You can get more bandwidth from an SSD (I just >> > checked numbers for 2 vendors' devices) by issuing larger read requests, >> > no matter whether the access pattern is sequential or random. >> I know, but the performance increase given the size is sublinear, and >> the situation here is slightly different. >> In order for the requests to be merged, they have to be submitted concurrently. 
>> So you have to compare 2 concurrent requests of size x with one >> request of size 2*x (with some CPU overhead). >> Moreover, you always pay the CPU overhead, even if you can't do the >> merging, and you must be very lucky to keep merging, because it means >> the two processes are working in lockstep; it is not sufficient that >> the requests are just nearby, as for rotational disks. >> > > For jeff, at least "dump" utility threads were kind of working in lockstep > for writes and he gained significantly by merging these queues together. Actually, it was for reads. > So the argument is that CPU overhead saving in this case is more substantial > as compared to gains made by lockstep read threads. I think we shall have to > have some numbers to justify that. Agreed. Corrado, I know you don't have the hardware, so I'll give this a run through the read-test2 program and see if it regresses at all. Cheers, Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 19:04 ` Jeff Moyer @ 2010-01-04 20:37 ` Corrado Zoccolo 2010-01-05 14:58 ` Jeff Moyer 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-04 20:37 UTC (permalink / raw) To: Jeff Moyer Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Mon, Jan 4, 2010 at 8:04 PM, Jeff Moyer <jmoyer@redhat.com> wrote: > Vivek Goyal <vgoyal@redhat.com> writes: > >> On Mon, Jan 04, 2010 at 07:37:17PM +0100, Corrado Zoccolo wrote: >>> On Mon, Jan 4, 2010 at 5:51 PM, Jeff Moyer <jmoyer@redhat.com> wrote: >>> > Corrado Zoccolo <czoccolo@gmail.com> writes: >>> > >>> >> Hi Vivkek, >>> >> >>> >> On Mon, Jan 4, 2010 at 3:47 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >>> >>> On Wed, Dec 30, 2009 at 11:22:47PM +0100, Corrado Zoccolo wrote: >>> >>>> Non rotational devices' performances are not affected by >>> >>>> distance of read requests, so there is no point in having >>> >>>> overhead to merge such queues. >>> >>>> This doesn't apply to writes, so this patch changes the >>> >>>> queued[] field, to be indexed by READ/WRITE instead of >>> >>>> SYNC/ASYNC, and only compute proximity for queues with >>> >>>> WRITE requests. >>> >>>> >>> >>> >>> >>> Hi Corrado, >>> >>> >>> >>> What's the reason that reads don't benefit from merging queues and hence >>> >>> merging requests and only writes do on SSD? >>> >> >>> >> On SSDs, reads are just limited by the maximum transfer rate, and >>> >> larger (i.e. merged) reads will just take proportionally longer. >>> > >>> > This is simply not true. You can get more bandwidth from an SSD (I just >>> > checked numbers for 2 vendors' devices) by issuing larger read requests, >>> > no matter whether the access pattern is sequential or random. >>> I know, but the performance increase given the size is sublinear, and >>> the situation here is slightly different. 
>>> In order for the requests to be merged, they have to be submitted concurrently. >>> So you have to compare 2 concurrent requests of size x with one >>> request of size 2*x (with some CPU overhead). >>> Moreover, you always pay the CPU overhead, even if you can't do the >>> merging, and you must be very lucky to keep merging, because it means >>> the two processes are working in lockstep; it is not sufficient that >>> the requests are just nearby, as for rotational disks. >>> >> >> For jeff, at least "dump" utility threads were kind of working in lockstep >> for writes and he gained significantly by merging these queues together. > > Actually, it was for reads. > >> So the argument is that CPU overhead saving in this case is more substantial >> as compared to gains made by lockstep read threads. I think we shall have to >> have some numbers to justify that. > > Agreed. Corrado, I know you don't have the hardware, so I'll give this > a run through the read-test2 program and see if it regresses at all. Great. Thanks a lot, Corrado > > Cheers, > Jeff > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-04 20:37 ` Corrado Zoccolo @ 2010-01-05 14:58 ` Jeff Moyer 2010-01-05 15:13 ` Vivek Goyal 0 siblings, 1 reply; 41+ messages in thread From: Jeff Moyer @ 2010-01-05 14:58 UTC (permalink / raw) To: Corrado Zoccolo Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Corrado Zoccolo <czoccolo@gmail.com> writes: > On Mon, Jan 4, 2010 at 8:04 PM, Jeff Moyer <jmoyer@redhat.com> wrote: >> Vivek Goyal <vgoyal@redhat.com> writes: >>>> >>> Hi Corrado, >>>> >>> >>>> >>> What's the reason that reads don't benefit from merging queues and hence >>>> >>> merging requests and only writes do on SSD? >>>> >> >>>> >> On SSDs, reads are just limited by the maximum transfer rate, and >>>> >> larger (i.e. merged) reads will just take proportionally longer. >>>> > >>>> > This is simply not true. You can get more bandwidth from an SSD (I just >>>> > checked numbers for 2 vendors' devices) by issuing larger read requests, >>>> > no matter whether the access pattern is sequential or random. >>>> I know, but the performance increase given the size is sublinear, and >>>> the situation here is slightly different. >>>> In order for the requests to be merged, they have to be submitted concurrently. >>>> So you have to compare 2 concurrent requests of size x with one >>>> request of size 2*x (with some CPU overhead). >>>> Moreover, you always pay the CPU overhead, even if you can't do the >>>> merging, and you must be very lucky to keep merging, because it means >>>> the two processes are working in lockstep; it is not sufficient that >>>> the requests are just nearby, as for rotational disks. >>>> >>> >>> For jeff, at least "dump" utility threads were kind of working in lockstep >>> for writes and he gained significantly by merging these queues together. >> >> Actually, it was for reads. 
>> >>> So the argument is that CPU overhead saving in this case is more substantial >>> as compared to gains made by lockstep read threads. I think we shall have to >>> have some numbers to justify that. >> >> Agreed. Corrado, I know you don't have the hardware, so I'll give this >> a run through the read-test2 program and see if it regresses at all. > Great. I ran the test program 50 times, and here are the results: ==> vanilla <== Mean: 163.22728 Population Std. Dev.: 0.55401 ==> patched <== Mean: 162.91558 Population Std. Dev.: 1.08612 This looks acceptable to me. Cheers, Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-05 14:58 ` Jeff Moyer @ 2010-01-05 15:13 ` Vivek Goyal 2010-01-05 21:19 ` Jeff Moyer 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-05 15:13 UTC (permalink / raw) To: Jeff Moyer Cc: Corrado Zoccolo, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Tue, Jan 05, 2010 at 09:58:52AM -0500, Jeff Moyer wrote: > Corrado Zoccolo <czoccolo@gmail.com> writes: > > > On Mon, Jan 4, 2010 at 8:04 PM, Jeff Moyer <jmoyer@redhat.com> wrote: > >> Vivek Goyal <vgoyal@redhat.com> writes: > >>>> >>> Hi Corrado, > >>>> >>> > >>>> >>> What's the reason that reads don't benefit from merging queues and hence > >>>> >>> merging requests and only writes do on SSD? > >>>> >> > >>>> >> On SSDs, reads are just limited by the maximum transfer rate, and > >>>> >> larger (i.e. merged) reads will just take proportionally longer. > >>>> > > >>>> > This is simply not true. You can get more bandwidth from an SSD (I just > >>>> > checked numbers for 2 vendors' devices) by issuing larger read requests, > >>>> > no matter whether the access pattern is sequential or random. > >>>> I know, but the performance increase given the size is sublinear, and > >>>> the situation here is slightly different. > >>>> In order for the requests to be merged, they have to be submitted concurrently. > >>>> So you have to compare 2 concurrent requests of size x with one > >>>> request of size 2*x (with some CPU overhead). > >>>> Moreover, you always pay the CPU overhead, even if you can't do the > >>>> merging, and you must be very lucky to keep merging, because it means > >>>> the two processes are working in lockstep; it is not sufficient that > >>>> the requests are just nearby, as for rotational disks. > >>>> > >>> > >>> For jeff, at least "dump" utility threads were kind of working in lockstep > >>> for writes and he gained significantly by merging these queues together. > >> > >> Actually, it was for reads. 
> >> > >>> So the argument is that CPU overhead saving in this case is more substantial > >>> as compared to gains made by lockstep read threads. I think we shall have to > >>> have some numbers to justify that. > >> > >> Agreed. Corrado, I know you don't have the hardware, so I'll give this > >> a run through the read-test2 program and see if it regresses at all. > > Great. > > I ran the test program 50 times, and here are the results: > > ==> vanilla <== > Mean: 163.22728 > Population Std. Dev.: 0.55401 > > ==> patched <== > Mean: 162.91558 > Population Std. Dev.: 1.08612 > > This looks acceptable to me. Thanks Jeff, one thing comes to mind. Now with recent changes, we drive deeper depths on SSD with NCQ and there are not many pending cfqq on service tree until and unless number of parallel threads exceed NCQ depth (32). If that's the case, then I think we might not be seeing lot of queue merging happening in this test case until and unless dump utility is creating more than 32 threads. If time permits, it might also be interesting to run the same test with queue depth 1 and see if SSDs without NCQ will suffer or not. Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-05 15:13 ` Vivek Goyal @ 2010-01-05 21:19 ` Jeff Moyer 2010-01-05 21:48 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Jeff Moyer @ 2010-01-05 21:19 UTC (permalink / raw) To: Vivek Goyal, Corrado Zoccolo Cc: Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Vivek Goyal <vgoyal@redhat.com> writes: > Thanks Jeff, one thing comes to mind. Now with recent changes, we drive deeper > depths on SSD with NCQ and there are not many pending cfqq on service tree > until and unless number of parallel threads exceed NCQ depth (32). If > that's the case, then I think we might not be seeing lot of queue merging > happening in this test case until and unless dump utility is creating more > than 32 threads. > > If time permits, it might also be interesting to run the same test with queue > depth 1 and see if SSDs without NCQ will suffer or not. Corrado, I think what Vivek is getting at is that you should check for both blk_queue_nonrot and cfqd->hw_tag (like in cfq_arm_slice_timer). Do you agree? Cheers, Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-05 21:19 ` Jeff Moyer @ 2010-01-05 21:48 ` Corrado Zoccolo 2010-01-07 10:56 ` Kirill Afonshin 2010-01-10 12:55 ` Corrado Zoccolo 0 siblings, 2 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-05 21:48 UTC (permalink / raw) To: Jeff Moyer Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Tue, Jan 5, 2010 at 10:19 PM, Jeff Moyer <jmoyer@redhat.com> wrote: > Vivek Goyal <vgoyal@redhat.com> writes: > >> Thanks Jeff, one thing comes to mind. Now with recent changes, we drive deeper >> depths on SSD with NCQ and there are not many pending cfqq on service tree >> until and unless number of parallel threads exceed NCQ depth (32). If >> that's the case, then I think we might not be seeing lot of queue merging >> happening in this test case until and unless dump utility is creating more >> than 32 threads. >> >> If time permits, it might also be interesting to run the same test with queue >> depth 1 and see if SSDs without NCQ will suffer or not. > > Corrado, I think what Vivek is getting at is that you should check for > both blk_queue_nonrot and cfqd->hw_tag (like in cfq_arm_slice_timer). > Do you agree? Well, actually I didn't want to distinguish on hw_tag here. I had to still allow merging of writes exactly because a write merge can save hundreds of ms on a non-NCQ SSD. Vivek is right that on non-NCQ SSDs a successful merge would increase the performance, but I still think that the likelihood of a merge is so low that maintaining the RB-tree is superfluous. Usually, those devices are coupled with low-end CPUs, so saving the code execution could be a win there too. I'll run some tests on my netbook. BTW, I'm looking at read-test2 right now. I see it doesn't use direct I/O, so it relies also on page cache.
I think page cache can detect the hidden sequential pattern, and thus send big readahead requests to the device, making merging impossible (on my SSD, readahead size and max hw request size match). Thanks, Corrado > > Cheers, > Jeff > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-05 21:48 ` Corrado Zoccolo @ 2010-01-07 10:56 ` Kirill Afonshin 2010-01-07 13:38 ` Corrado Zoccolo 2010-01-10 12:55 ` Corrado Zoccolo 1 sibling, 1 reply; 41+ messages in thread From: Kirill Afonshin @ 2010-01-07 10:56 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Moyer, Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng I think we should not rely on NCQ/non-NCQ or blk_queue_nonrot() because it may be incorrect: try this: find /sys/ -name rotational 2>/dev/null find /sys/ -name rotational 2>/dev/null|xargs cat All devices are reported as rotational for me, including ram, loop and usb flash drives. Physical block size and optimal io size have invalid values for all my usb flash drives. I think it would be better to do a short performance test before mount. It will provide all necessary information for the io scheduler. We don't need information about NCQ and rotational. We need to predict how much time a specific io operation will take in the current context. PS: I'm not a native speaker. Best regards, Kirill Afonshin ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 10:56 ` Kirill Afonshin @ 2010-01-07 13:38 ` Corrado Zoccolo 2010-01-07 14:36 ` Vivek Goyal 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-07 13:38 UTC (permalink / raw) To: Kirill Afonshin Cc: Jeff Moyer, Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Hi Kirill, NCQ is actually measured, so it is reliable. Rotational can be wrong, but you can write to it (e.g. at machine startup) to obtain better handling of your disks. I am also a fan of auto-tuning, and trying to achieve it in cfq, but it is kind of complex, since there are many factors, and some of them aren't modeled by cfq currently. An example is that, on cheap SSDs or flash cards, small writes are far slower than anything else (they can take up to 0.5s), while a read will usually take less than 1ms. Currently, CFQ has no way to handle this extreme situation, since it uses just one idle value (8ms) for all transitions. Corrado On 1/7/10, Kirill Afonshin <kirill_nnov@mail.ru> wrote: > I think we should not rely on NCQ/non-NCQ or blk_queue_nonrot() because it > may be incorrect: > try this: > > find /sys/ -name rotational 2>/dev/null > find /sys/ -name rotational 2>/dev/null|xargs cat > > all devices are reported as rotational for me including ram, loop and usb > flash drive. Physical block size and optimal io size has invalid values for > all my usb flash drives. > > I think it would be better to do a short performance test before mount. It > will provide all necessary information for io scheduler. We doesn't need > information about NCQ and rotational. We need to predict how much time > specific io operation will take in current context. > > PS: I'm not native speaker. > > Best regards, > Kirill Afonshin > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 13:38 ` Corrado Zoccolo @ 2010-01-07 14:36 ` Vivek Goyal 2010-01-07 17:00 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-07 14:36 UTC (permalink / raw) To: Corrado Zoccolo Cc: Kirill Afonshin, Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Thu, Jan 07, 2010 at 02:38:29PM +0100, Corrado Zoccolo wrote: > Hi Kirill, > NCQ is actually measured, so it is reliable. > Rotational can be wrong, but you can write to it (e.g. at machine > startup) to obtain better handling of your disks. > I am also a fan of auto-tuning, and trying to achieve it in cfq, but > it is kind of complex, since there are many factors, and some of them > aren't modeled by cfq currently. > An example is that, on cheap SSDs or flash cards, small writes are far > slower than anything else (they can take up to 0.5s), while a read > will usually take less than 1ms. Currently, CFQ has no way to handle > this extreme situation, since it uses just one idle value (8ms) for > all transitions. Hi Corrado, How does idle time value relate to flash card being slower for writes? If flash card is slow and we choose to idle on queue (because of direct writes), idle time value does not even kick in. We just continue to remain on same cfqq and don't do dispatch from next cfqq. Idle time value will matter only if there was delay from cpu side or from workload side in issuing next request after completion of previous one. Thanks Vivek > > Corrado > > On 1/7/10, Kirill Afonshin <kirill_nnov@mail.ru> wrote: > > I think we should not rely on NCQ/non-NCQ or blk_queue_nonrot() because it > > may be incorrect: > > try this: > > > > find /sys/ -name rotational 2>/dev/null > > find /sys/ -name rotational 2>/dev/null|xargs cat > > > > all devices are reported as rotational for me including ram, loop and usb > > flash drive. 
Physical block size and optimal io size has invalid values for > > all my usb flash drives. > > > > I think it would be better to do a short performance test before mount. It > > will provide all necessary information for io scheduler. We doesn't need > > information about NCQ and rotational. We need to predict how much time > > specific io operation will take in current context. > > > > PS: I'm not native speaker. > > > > Best regards, > > Kirill Afonshin > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 14:36 ` Vivek Goyal @ 2010-01-07 17:00 ` Corrado Zoccolo 2010-01-07 18:37 ` Vivek Goyal 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-07 17:00 UTC (permalink / raw) To: Vivek Goyal Cc: Kirill Afonshin, Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Thu, Jan 7, 2010 at 3:36 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > Hi Corrado, > > How does idle time value relate to flash card being slower for writes? If > flash card is slow and we choose to idle on queue (because of direct > writes), idle time value does not even kick in. We just continue to remain > on same cfqq and don't do dispatch from next cfqq. > > Idle time value will matter only if there was delay from cpu side or from > workload side in issuing next request after completion of previous one. > > Thanks > Vivek Hi Vivek, for me, the optimal idle value should approximate the cost of switching to another queue. So, for reads, if we are waiting for more than 1 ms, then we are wasting bandwidth. But if we switch from reads to writes (since the reader thought for slightly more than 1ms), and the write is really slow, we can have a really long latency before the reader can complete its new request. So the optimal choice would be to have two different idle times, one for switching between readers, and one when switching from readers to writers. Thanks, Corrado ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 17:00 ` Corrado Zoccolo @ 2010-01-07 18:37 ` Vivek Goyal 2010-01-07 20:16 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-07 18:37 UTC (permalink / raw) To: Corrado Zoccolo Cc: Kirill Afonshin, Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Thu, Jan 07, 2010 at 06:00:54PM +0100, Corrado Zoccolo wrote: > On Thu, Jan 7, 2010 at 3:36 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > > Hi Corrado, > > > > How does idle time value relate to flash card being slower for writes? If > > flash card is slow and we choose to idle on queue (because of direct > > writes), idle time value does not even kick in. We just continue to remain > > on same cfqq and don't do dispatch from next cfqq. > > > > Idle time value will matter only if there was delay from cpu side or from > > workload side in issuing next request after completion of previous one. > > > > Thanks > > Vivek > Hi Vivek, > for me, the optimal idle value should approximate the cost of > switching to an other queue. > So, for reads, if we are waiting for more than 1 ms, then we are > wasting bandwidth. > But if we switch from reads to writes (since the reader thought > slightly more than 1ms), and the write is really slow, we can have a > really long latency before the reader can complete its new request. What workload do you have where reader is thinking more than a 1ms? To me one issue probably is that for sync queues we drive shallow (1-2) queue depths and this can be an issue on high end storage where there can be multiple disks behind the array and this sync queue is just not keeping array fully utilized. Buffered sequential reads mitigate this issue up to some extent as requests size is big. Idling on the queue helps in providing differentiated service for higher priority queue and also helps to get more out of disk on rotational media with single disk. 
But I suspect that on big arrays, this idling on sync queues and not driving deeper queue depths might hurt. So if we had a way to detect that we got a big storage array underneath, maybe we can get more throughput by not idling at all. But we will also lose the service differentiation between various ioprio queues. I guess your patches of monitoring service times might be useful here. > So the optimal choice would be to have two different idle times, one > for switch between readers, and one when switching from readers to > writers. Sounds like read and write batches. With your workload type, we are already doing it. Idle per service tree. At least it solves the problem for sync-noidle queues where we don't idle between read queues but do idle between read and buffered write (async queues). In my testing so far, I have not encountered the workloads where readers are thinking a lot. Think time has been very small. Thanks Vivek ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 18:37 ` Vivek Goyal @ 2010-01-07 20:16 ` Corrado Zoccolo 2010-01-08 18:53 ` Vivek Goyal 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-07 20:16 UTC (permalink / raw) To: Vivek Goyal Cc: Kirill Afonshin, Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Thu, Jan 7, 2010 at 7:37 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Thu, Jan 07, 2010 at 06:00:54PM +0100, Corrado Zoccolo wrote: >> On Thu, Jan 7, 2010 at 3:36 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >> > Hi Corrado, >> > >> > How does idle time value relate to flash card being slower for writes? If >> > flash card is slow and we choose to idle on queue (because of direct >> > writes), idle time value does not even kick in. We just continue to remain >> > on same cfqq and don't do dispatch from next cfqq. >> > >> > Idle time value will matter only if there was delay from cpu side or from >> > workload side in issuing next request after completion of previous one. >> > >> > Thanks >> > Vivek >> Hi Vivek, >> for me, the optimal idle value should approximate the cost of >> switching to an other queue. >> So, for reads, if we are waiting for more than 1 ms, then we are >> wasting bandwidth. >> But if we switch from reads to writes (since the reader thought >> slightly more than 1ms), and the write is really slow, we can have a >> really long latency before the reader can complete its new request. > > What workload do you have where reader is thinking more than a 1ms? My representative workload is booting my netbook. I found that if I let cfq autotune to a lower slice idle, boot slows down, and bootchart clearly shows that I/O wait increases and I/O bandwidth decreases. This tells me that the writes are getting into the picture earlier than with 8ms idle, and causing a regression. Note that the reader doesn't need to be one. 
I could have a set of readers, and I want to switch between them in 1ms, but idle up to 10ms or more before switching to async writes. > > To me one issue probably is that for sync queues we drive shallow (1-2) > queue depths and this can be an issue on high end storage where there > can be multiple disks behind the array and this sync queue is just > not keeping array fully utilized. Buffered sequential reads mitigate > this issue up to some extent as requests size is big. I think for sequential queues, you should tune your readahead to hit all the disks of the raid. In that case, idling makes sense, because all the disks will now be ready to serve the new request immediately. > > Idling on the queue helps in providing differentiated service for higher > priority queue and also helps to get more out of disk on rotational media > with single disk. But I suspect that on big arrays, this idling on sync > queues and not driving deeper queue depths might hurt. We should have some numbers to support that. In all tests I saw, setting slice idle to 0 causes regression also on decently sized arrays, at least when the number of concurrent processes is big enough that 2 of them likely will make requests to the same disk (and by the birthday paradox, this can be quite a small number, even with very large arrays: e.g. with 365-disk raids, 23 concurrent processes have 50% probability of colliding on the same disk at every single request sent). > > So if we had a way to detect that we got a big storage array underneath, > may be we can get more throughput by not idling at all. But we will also > loose the service differentiation between various ioprio queues. I guess > your patches of monitoring service times might be useful here. It might, but we need to identify hardware in which not idling is beneficial, measure its behaviour and see which measurable parameter can clearly distinguish it from other hardware where idling is required.
If we are speaking of raid of rotational disks, seek time (which I was measuring) is not a good parameter, because it can still be high. > >> So the optimal choice would be to have two different idle times, one >> for switch between readers, and one when switching from readers to >> writers. > > Sounds like read and write batches. With you workload type, we are already > doing it. Idle per service tree. At least it solves the problem for > sync-noidle queues where we don't idle between read queues but do idle > between read and buffered write (async queues). > In fact those changes improved my netbook boot time a lot, and I'm not even using sreadahead. But if autotuning reduces the slice idle, then I see again the huge penalty of small writes. > In my testing so far, I have not encountered the workloads where readers > are thinking a lot. Think time has been very small. Sometimes real workloads have more variable think times than our synthetic benchmarks. > > Thanks > Vivek > Thanks, Corrado ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-07 20:16 ` Corrado Zoccolo @ 2010-01-08 18:53 ` Vivek Goyal 0 siblings, 0 replies; 41+ messages in thread From: Vivek Goyal @ 2010-01-08 18:53 UTC (permalink / raw) To: Corrado Zoccolo Cc: Kirill Afonshin, Jeff Moyer, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng On Thu, Jan 07, 2010 at 09:16:30PM +0100, Corrado Zoccolo wrote: > On Thu, Jan 7, 2010 at 7:37 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > > On Thu, Jan 07, 2010 at 06:00:54PM +0100, Corrado Zoccolo wrote: > >> On Thu, Jan 7, 2010 at 3:36 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > >> > Hi Corrado, > >> > > >> > How does idle time value relate to flash card being slower for writes? If > >> > flash card is slow and we choose to idle on queue (because of direct > >> > writes), idle time value does not even kick in. We just continue to remain > >> > on same cfqq and don't do dispatch from next cfqq. > >> > > >> > Idle time value will matter only if there was delay from cpu side or from > >> > workload side in issuing next request after completion of previous one. > >> > > >> > Thanks > >> > Vivek > >> Hi Vivek, > >> for me, the optimal idle value should approximate the cost of > >> switching to an other queue. > >> So, for reads, if we are waiting for more than 1 ms, then we are > >> wasting bandwidth. > >> But if we switch from reads to writes (since the reader thought > >> slightly more than 1ms), and the write is really slow, we can have a > >> really long latency before the reader can complete its new request. > > > > What workload do you have where reader is thinking more than a 1ms? > My representative workload is booting my netbook. I found that if I > let cfq autotune to a lower slice idle, boot slows down, and bootchart > clearly shows that I/O wait increases and I/O bandwidth decreases. > This tells me that the writes are getting into the picture earlier > than with 8ms idle, and causing a regression. 
> Note that the reader doesn't need to be one. I could have a set of > readers, and I want to switch between them in 1ms, but idle up to 10ms > or more before switching to async writes. Ok, so booting on your netbook where write cost is high is the case. So in this particular case you prefer to delay writes a bit to reduce the read latency and writes can catch up a little later. > > > > To me one issue probably is that for sync queues we drive shallow (1-2) > > queue depths and this can be an issue on high end storage where there > > can be multiple disks behind the array and this sync queue is just > > not keeping array fully utilized. Buffered sequential reads mitigate > > this issue up to some extent as requests size is big. > I think for sequential queues, you should tune your readahead to hit > all the disks of the raid. In that case, idling makes sense, because > all the disks will now be ready to serve the new request immediately. > > > > > > Idling on the queue helps in providing differentiated service for higher > > priority queue and also helps to get more out of disk on rotational media > > with single disk. But I suspect that on big arrays, this idling on sync > > queues and not driving deeper queue depths might hurt. > We should have some numbers to support. In all tests I saw, setting > slice idle to 0 causes regression also on decently sized arrays, at > least when the number of concurrent processes is big enough that 2 of > them likely will make requests to the same disk (and by the birthday > paradox, this can be a quite small number, even with very large > arrays: e.g. with 365-disk raids, 23 concurrent processes have 50% > probability of colliding on the same disk at every single request > sent). I will do some tests and see if there are cases where driving shallower depths hurts. Vivek > > > > > So if we had a way to detect that we got a big storage array underneath, > > may be we can get more throughput by not idling at all.
But we will also > > lose the service differentiation between various ioprio queues. I guess > > your patches of monitoring service times might be useful here. > It might, but we need to identify hardware on which not idling is > beneficial, measure its behaviour and see which measurable parameter > can clearly distinguish it from other hardware where idling is > required. If we are speaking of RAIDs of rotational disks, seek time > (which I was measuring) is not a good parameter, because it can still > be high. > > > >> So the optimal choice would be to have two different idle times, one > >> for switching between readers, and one when switching from readers to > >> writers. > > > > Sounds like read and write batches. With your workload type, we are already > > doing it. Idle per service tree. At least it solves the problem for > > sync-noidle queues where we don't idle between read queues but do idle > > between read and buffered write (async queues). > > > In fact those changes improved my netbook boot time a lot, and I'm not > even using sreadahead. But if autotuning reduces the slice idle, then > I see again the huge penalty of small writes. > > > In my testing so far, I have not encountered the workloads where readers > > are thinking a lot. Think time has been very small. > Sometimes real workloads have more variable think times than our > synthetic benchmarks. > > > > > Thanks > > Vivek > > > Thanks, > Corrado ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: non-rot devices do not need read queue merging 2010-01-05 21:48 ` Corrado Zoccolo 2010-01-07 10:56 ` Kirill Afonshin @ 2010-01-10 12:55 ` Corrado Zoccolo 1 sibling, 0 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-10 12:55 UTC (permalink / raw) To: Jeff Moyer Cc: Vivek Goyal, Jens Axboe, Linux-Kernel, Shaohua Li, Gui Jianfeng Hi, On Tue, Jan 5, 2010 at 10:48 PM, Corrado Zoccolo <czoccolo@gmail.com> wrote: > On Tue, Jan 5, 2010 at 10:19 PM, Jeff Moyer <jmoyer@redhat.com> wrote: >> Vivek Goyal <vgoyal@redhat.com> writes: >> >>> Thanks Jeff, one thing comes to mind. Now with recent changes, we drive deeper >>> depths on SSD with NCQ and there are not many pending cfqq on service tree >>> until and unless number of parallel threads exceed NCQ depth (32). If >>> that's the case, then I think we might not be seeing a lot of queue merging >>> happening in this test case until and unless dump utility is creating more >>> than 32 threads. >>> >>> If time permits, it might also be interesting to run the same test with queue >>> depth 1 and see if SSDs without NCQ will suffer or not. >> >> Corrado, I think what Vivek is getting at is that you should check for >> both blk_queue_nonrot and cfqd->hw_tag (like in cfq_arm_slice_timer). >> Do you agree? > Well, actually I didn't want to distinguish on hw_tag here. I had to > still allow merging of writes exactly because a write merge can save > hundreds of ms on a non-NCQ SSD. > > Vivek is right that on non-NCQ SSDs a successful merge would increase > the performance, but I still think that the likelihood of a merge is > so low that maintaining the RB-tree is superfluous. Usually, those > devices are coupled with low-end CPUs, so saving the code execution > could be a win there too. I'll run some tests on my netbook. > > BTW, I'm looking at read-test2 right now. I see it doesn't use direct > I/O, so it also relies on the page cache. 
I think page cache can detect > the hidden sequential pattern, and thus send big readahead requests to > the device, making merging impossible (on my SSD, readahead size and > max hw request size match). > I did some tests, and found a surprising thing. Simply running the test script, the BW levels off at a high value, regardless of whether queue merging in CFQ is enabled or disabled. I suspected something odd was going on, so I modified the script to drop caches before each run, and now I found that with queue merging it is 3 times faster than without, so on non-NCQ SSDs it is better to have queue merging enabled, after all. I'm wondering why a full page cache can give such large results. My disk is 4 times larger than the available RAM, so not clearing it should give only about a 1/4 boost. I have to do more tests to understand what's going on... Thanks, Corrado > >> >> Cheers, >> Jeff >> > -- __________________________________________________________________________ dott. Corrado Zoccolo mailto:czoccolo@gmail.com PhD - Department of Computer Science - University of Pisa, Italy -------------------------------------------------------------------------- The self-confidence of a warrior is not the self-confidence of the average man. The average man seeks certainty in the eyes of the onlooker and calls that self-confidence. The warrior seeks impeccability in his own eyes and calls that humbleness. Tales of Power - C. Castaneda ^ permalink raw reply [flat|nested] 41+ messages in thread
* [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2009-12-30 22:22 ` [PATCH] cfq-iosched: non-rot devices do not need read " Corrado Zoccolo 2010-01-04 14:47 ` Vivek Goyal @ 2010-01-10 21:04 ` Corrado Zoccolo 2010-01-10 21:08 ` Corrado Zoccolo 2010-01-11 11:25 ` Jeff Garzik 1 sibling, 2 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-10 21:04 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng, Corrado Zoccolo NCQ SSDs' performances are not affected by distance of read requests, so there is no point in having overhead to merge such queues. Non-NCQ SSDs showed regression in some special cases, so they are ruled out by this patch. This patch intentionally doesn't affect writes, so it changes the queued[] field, to be indexed by READ/WRITE instead of SYNC/ASYNC, and only compute proximity for queues with WRITE requests. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> --- block/cfq-iosched.c | 28 +++++++++++++++++----------- 1 files changed, 17 insertions(+), 11 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 918c7fd..3b7c60e 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -108,9 +108,9 @@ struct cfq_queue { struct rb_root sort_list; /* if fifo isn't expired, next request to serve */ struct request *next_rq; - /* requests queued in sort_list */ + /* requests queued in sort_list, indexed by READ/WRITE */ int queued[2]; - /* currently allocated requests */ + /* currently allocated requests, indexed by READ/WRITE */ int allocated[2]; /* fifo list of requests in sort_list */ struct list_head fifo; @@ -436,6 +436,10 @@ static inline void cic_set_cfqq(struct cfq_io_context *cic, cic->cfqq[is_sync] = cfqq; } +static inline bool is_smart_ssd(struct cfq_data *cfqd) +{ + return blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag; +} /* * We regard a request as SYNC, if it's either a read or has the SYNC bit * set (in which case it could also be direct WRITE). 
@@ -1268,7 +1272,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) return; if (!cfqq->next_rq) return; - + if (is_smart_ssd(cfqd) && !cfqq->queued[WRITE]) + return; cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, blk_rq_pos(cfqq->next_rq), &parent, &p); @@ -1337,10 +1342,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq) static void cfq_del_rq_rb(struct request *rq) { struct cfq_queue *cfqq = RQ_CFQQ(rq); - const int sync = rq_is_sync(rq); + const int rw = rq_data_dir(rq); - BUG_ON(!cfqq->queued[sync]); - cfqq->queued[sync]--; + BUG_ON(!cfqq->queued[rw]); + cfqq->queued[rw]--; elv_rb_del(&cfqq->sort_list, rq); @@ -1363,7 +1368,7 @@ static void cfq_add_rq_rb(struct request *rq) struct cfq_data *cfqd = cfqq->cfqd; struct request *__alias, *prev; - cfqq->queued[rq_is_sync(rq)]++; + cfqq->queued[rq_data_dir(rq)]++; /* * looks a little odd, but the first insert might return an alias. @@ -1393,7 +1398,7 @@ static void cfq_add_rq_rb(struct request *rq) static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq) { elv_rb_del(&cfqq->sort_list, rq); - cfqq->queued[rq_is_sync(rq)]--; + cfqq->queued[rq_data_dir(rq)]--; cfq_add_rq_rb(rq); } @@ -1689,7 +1694,8 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, struct cfq_queue *__cfqq; sector_t sector = cfqd->last_position; - if (RB_EMPTY_ROOT(root)) + if (RB_EMPTY_ROOT(root) || + (is_smart_ssd(cfqd) && !cur_cfqq->queued[WRITE])) return NULL; /* @@ -1796,7 +1802,7 @@ static bool cfq_should_idle(struct cfq_data *cfqd, struct cfq_queue *cfqq) /* We do for queues that were marked with idle window flag. 
*/ if (cfq_cfqq_idle_window(cfqq) && - !(blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)) + !is_smart_ssd(cfqd)) return true; /* @@ -1817,7 +1823,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd) * for devices that support queuing, otherwise we still have a problem * with sync vs async workloads. */ - if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag) + if (is_smart_ssd(cfqd)) return; WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list)); -- 1.6.4.4 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-10 21:04 ` [PATCH] cfq-iosched: NCQ SSDs " Corrado Zoccolo @ 2010-01-10 21:08 ` Corrado Zoccolo 2010-01-11 11:25 ` Jeff Garzik 1 sibling, 0 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-10 21:08 UTC (permalink / raw) To: Jens Axboe Cc: Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng, Corrado Zoccolo NCQ SSDs' performances are not affected by distance of read requests, so there is no point in having overhead to merge such queues. Non-NCQ SSDs showed regression in some special cases, so they are ruled out by this patch. This patch intentionally doesn't affect writes, so it changes the queued[] field, to be indexed by READ/WRITE instead of SYNC/ASYNC, and only compute proximity for queues with WRITE requests. Signed-off-by: Corrado Zoccolo <czoccolo@gmail.com> --- block/cfq-iosched.c | 28 +++++++++++++++++----------- 1 files changed, 17 insertions(+), 11 deletions(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 918c7fd..18bce18 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -108,9 +108,9 @@ struct cfq_queue { struct rb_root sort_list; /* if fifo isn't expired, next request to serve */ struct request *next_rq; - /* requests queued in sort_list */ + /* requests queued in sort_list, indexed by READ/WRITE */ int queued[2]; - /* currently allocated requests */ + /* currently allocated requests, indexed by READ/WRITE */ int allocated[2]; /* fifo list of requests in sort_list */ struct list_head fifo; @@ -436,6 +436,10 @@ static inline void cic_set_cfqq(struct cfq_io_context *cic, cic->cfqq[is_sync] = cfqq; } +static inline bool is_smart_ssd(struct cfq_data *cfqd) +{ + return blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag; +} /* * We regard a request as SYNC, if it's either a read or has the SYNC bit * set (in which case it could also be direct WRITE). 
@@ -1268,7 +1272,8 @@ static void cfq_prio_tree_add(struct cfq_data *cfqd, struct cfq_queue *cfqq) return; if (!cfqq->next_rq) return; - + if (is_smart_ssd(cfqd) && !cfqq->queued[WRITE]) + return; cfqq->p_root = &cfqd->prio_trees[cfqq->org_ioprio]; __cfqq = cfq_prio_tree_lookup(cfqd, cfqq->p_root, blk_rq_pos(cfqq->next_rq), &parent, &p); @@ -1337,10 +1342,10 @@ static void cfq_del_cfqq_rr(struct cfq_data *cfqd, struct cfq_queue *cfqq) static void cfq_del_rq_rb(struct request *rq) { struct cfq_queue *cfqq = RQ_CFQQ(rq); - const int sync = rq_is_sync(rq); + const int rw = rq_data_dir(rq); - BUG_ON(!cfqq->queued[sync]); - cfqq->queued[sync]--; + BUG_ON(!cfqq->queued[rw]); + cfqq->queued[rw]--; elv_rb_del(&cfqq->sort_list, rq); @@ -1363,7 +1368,7 @@ static void cfq_add_rq_rb(struct request *rq) struct cfq_data *cfqd = cfqq->cfqd; struct request *__alias, *prev; - cfqq->queued[rq_is_sync(rq)]++; + cfqq->queued[rq_data_dir(rq)]++; /* * looks a little odd, but the first insert might return an alias. @@ -1393,7 +1398,7 @@ static void cfq_add_rq_rb(struct request *rq) static void cfq_reposition_rq_rb(struct cfq_queue *cfqq, struct request *rq) { elv_rb_del(&cfqq->sort_list, rq); - cfqq->queued[rq_is_sync(rq)]--; + cfqq->queued[rq_data_dir(rq)]--; cfq_add_rq_rb(rq); } @@ -1689,7 +1694,8 @@ static struct cfq_queue *cfqq_close(struct cfq_data *cfqd, struct cfq_queue *__cfqq; sector_t sector = cfqd->last_position; - if (RB_EMPTY_ROOT(root)) + if (RB_EMPTY_ROOT(root) || + (is_smart_ssd(cfqd) && !cur_cfqq->queued[WRITE])) return NULL; /* @@ -1796,7 +1802,7 @@ static bool cfq_should_idle(struct cfq_data *cfqd, struct cfq_queue *cfqq) /* We do for queues that were marked with idle window flag. 
*/ if (cfq_cfqq_idle_window(cfqq) && - !(blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag)) + !is_smart_ssd(cfqd)) return true; /* @@ -1817,7 +1823,7 @@ static void cfq_arm_slice_timer(struct cfq_data *cfqd) * for devices that support queuing, otherwise we still have a problem * with sync vs async workloads. */ - if (blk_queue_nonrot(cfqd->queue) && cfqd->hw_tag) + if (is_smart_ssd(cfqd)) return; WARN_ON(!RB_EMPTY_ROOT(&cfqq->sort_list)); -- 1.6.4.4 ^ permalink raw reply related [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-10 21:04 ` [PATCH] cfq-iosched: NCQ SSDs " Corrado Zoccolo 2010-01-10 21:08 ` Corrado Zoccolo @ 2010-01-11 11:25 ` Jeff Garzik 2010-01-11 12:26 ` Corrado Zoccolo 1 sibling, 1 reply; 41+ messages in thread From: Jeff Garzik @ 2010-01-11 11:25 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jens Axboe, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: > NCQ SSDs' performances are not affected by > distance of read requests, so there is no point in having > overhead to merge such queues. > > Non-NCQ SSDs showed regression in some special cases, so > they are ruled out by this patch. > > This patch intentionally doesn't affect writes, so > it changes the queued[] field, to be indexed by > READ/WRITE instead of SYNC/ASYNC, and only compute proximity > for queues with WRITE requests. > > Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> That's not really true. Overhead always increases as the total number of ATA commands issued increases. Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 11:25 ` Jeff Garzik @ 2010-01-11 12:26 ` Corrado Zoccolo 2010-01-11 13:13 ` Jens Axboe 0 siblings, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-11 12:26 UTC (permalink / raw) To: Jeff Garzik Cc: Jens Axboe, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik <jeff@garzik.org> wrote: > On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >> >> NCQ SSDs' performances are not affected by >> distance of read requests, so there is no point in having >> overhead to merge such queues. >> >> Non-NCQ SSDs showed regression in some special cases, so >> they are ruled out by this patch. >> >> This patch intentionally doesn't affect writes, so >> it changes the queued[] field, to be indexed by >> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >> for queues with WRITE requests. >> >> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> > > That's not really true. Overhead always increases as the total number of > ATA commands issued increases. Jeff Moyer tested the patch on the workload that mostly benefit of queue merging, and found that the performance was improved by the patch. So removing the CPU overhead helps much more than the marginal gain given by merging on this hardware. Corrado > > Jeff > > > > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 12:26 ` Corrado Zoccolo @ 2010-01-11 13:13 ` Jens Axboe 2010-01-11 13:18 ` Jeff Garzik 0 siblings, 1 reply; 41+ messages in thread From: Jens Axboe @ 2010-01-11 13:13 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Garzik, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Mon, Jan 11 2010, Corrado Zoccolo wrote: > On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik <jeff@garzik.org> wrote: > > On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: > >> > >> NCQ SSDs' performances are not affected by > >> distance of read requests, so there is no point in having > >> overhead to merge such queues. > >> > >> Non-NCQ SSDs showed regression in some special cases, so > >> they are ruled out by this patch. > >> > >> This patch intentionally doesn't affect writes, so > >> it changes the queued[] field, to be indexed by > >> READ/WRITE instead of SYNC/ASYNC, and only compute proximity > >> for queues with WRITE requests. > >> > >> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> > > > > That's not really true. Overhead always increases as the total number of > > ATA commands issued increases. > > Jeff Moyer tested the patch on the workload that mostly benefit of > queue merging, and found that > the performance was improved by the patch. > So removing the CPU overhead helps much more than the marginal gain > given by merging on this hardware. It's not always going to be true. On SATA the command overhead is fairly low, but on other hardware that may not be the case. Unless you are CPU bound by your IO device, then merging will always be beneficial. I'm a little behind on emails after my vacation, Jeff what numbers did you generate and on what hardware? -- Jens Axboe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 13:13 ` Jens Axboe @ 2010-01-11 13:18 ` Jeff Garzik 2010-01-11 13:24 ` Jens Axboe 2010-01-11 14:53 ` Corrado Zoccolo 0 siblings, 2 replies; 41+ messages in thread From: Jeff Garzik @ 2010-01-11 13:18 UTC (permalink / raw) To: Jens Axboe Cc: Corrado Zoccolo, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On 01/11/2010 08:13 AM, Jens Axboe wrote: > On Mon, Jan 11 2010, Corrado Zoccolo wrote: >> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >>>> >>>> NCQ SSDs' performances are not affected by >>>> distance of read requests, so there is no point in having >>>> overhead to merge such queues. >>>> >>>> Non-NCQ SSDs showed regression in some special cases, so >>>> they are ruled out by this patch. >>>> >>>> This patch intentionally doesn't affect writes, so >>>> it changes the queued[] field, to be indexed by >>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >>>> for queues with WRITE requests. >>>> >>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >>> >>> That's not really true. Overhead always increases as the total number of >>> ATA commands issued increases. >> >> Jeff Moyer tested the patch on the workload that mostly benefit of >> queue merging, and found that >> the performance was improved by the patch. >> So removing the CPU overhead helps much more than the marginal gain >> given by merging on this hardware. > > It's not always going to be true. On SATA the command overhead is fairly > low, but on other hardware that may not be the case. Unless you are CPU > bound by your IO device, then merging will always be beneficial. I'm a > little behind on emails after my vacation, Jeff what numbers did you > generate and on what hardware? ...and on what workload? 
"the workload that mostly benefit of queue merging" is highly subjective, and likely does not cover most workloads SSDs will see in the field. Jeff ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 13:18 ` Jeff Garzik @ 2010-01-11 13:24 ` Jens Axboe 2010-01-11 14:53 ` Corrado Zoccolo 1 sibling, 0 replies; 41+ messages in thread From: Jens Axboe @ 2010-01-11 13:24 UTC (permalink / raw) To: Jeff Garzik Cc: Corrado Zoccolo, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Mon, Jan 11 2010, Jeff Garzik wrote: > On 01/11/2010 08:13 AM, Jens Axboe wrote: >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >>>>> >>>>> NCQ SSDs' performances are not affected by >>>>> distance of read requests, so there is no point in having >>>>> overhead to merge such queues. >>>>> >>>>> Non-NCQ SSDs showed regression in some special cases, so >>>>> they are ruled out by this patch. >>>>> >>>>> This patch intentionally doesn't affect writes, so >>>>> it changes the queued[] field, to be indexed by >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >>>>> for queues with WRITE requests. >>>>> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >>>> >>>> That's not really true. Overhead always increases as the total number of >>>> ATA commands issued increases. >>> >>> Jeff Moyer tested the patch on the workload that mostly benefit of >>> queue merging, and found that >>> the performance was improved by the patch. >>> So removing the CPU overhead helps much more than the marginal gain >>> given by merging on this hardware. >> >> It's not always going to be true. On SATA the command overhead is fairly >> low, but on other hardware that may not be the case. Unless you are CPU >> bound by your IO device, then merging will always be beneficial. I'm a >> little behind on emails after my vacation, Jeff what numbers did you >> generate and on what hardware? > > ...and on what workload? 
"the workload that mostly benefit of queue > merging" is highly subjective, and likely does not cover most workloads > SSDs will see in the field. That, too. The queue merging it not exactly cheap, so perhaps we can work on making that work better as well. I've got some new hardware in the bag that'll do IOPS in the millions range, so I'll throw some tests at it too once I get it cabled up. -- Jens Axboe ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 13:18 ` Jeff Garzik 2010-01-11 13:24 ` Jens Axboe @ 2010-01-11 14:53 ` Corrado Zoccolo 2010-01-11 16:44 ` Vivek Goyal 1 sibling, 1 reply; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-11 14:53 UTC (permalink / raw) To: Jeff Garzik Cc: Jens Axboe, Linux-Kernel, Jeff Moyer, Vivek Goyal, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: > On 01/11/2010 08:13 AM, Jens Axboe wrote: >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: >>> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >>>> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >>>>> >>>>> NCQ SSDs' performances are not affected by >>>>> distance of read requests, so there is no point in having >>>>> overhead to merge such queues. >>>>> >>>>> Non-NCQ SSDs showed regression in some special cases, so >>>>> they are ruled out by this patch. >>>>> >>>>> This patch intentionally doesn't affect writes, so >>>>> it changes the queued[] field, to be indexed by >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >>>>> for queues with WRITE requests. >>>>> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >>>> >>>> That's not really true. Overhead always increases as the total number >>>> of >>>> ATA commands issued increases. >>> >>> Jeff Moyer tested the patch on the workload that mostly benefit of >>> queue merging, and found that >>> the performance was improved by the patch. >>> So removing the CPU overhead helps much more than the marginal gain >>> given by merging on this hardware. >> >> It's not always going to be true. On SATA the command overhead is fairly >> low, but on other hardware that may not be the case. Unless you are CPU >> bound by your IO device, then merging will always be beneficial. I'm a >> little behind on emails after my vacation, Jeff what numbers did you >> generate and on what hardware? 
> ...and on what workload? "the workload that mostly benefit of queue > merging" is highly subjective, and likely does not cover most workloads SSDs > will see in the field. Hi Jeff, exactly. The workloads that benefit from queue merging are the ones in which a sequential read is actually split, and carried out by different processes in different I/O contexts, each sending requests with strides. This is clearly not the best way of doing sequential access (I would happily declare those programs as buggy). CFQ has code that merges queues in this case. I'm disabling the READ part for NCQ SSDs, since, as Jeff measured, the code overhead outweighs the gain from merging (if any). As you said, most workloads don't benefit from queue merging. On those, the patch just removes an overhead. Thanks, Corrado > Jeff > > > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 14:53 ` Corrado Zoccolo @ 2010-01-11 16:44 ` Vivek Goyal 2010-01-11 17:00 ` Corrado Zoccolo 0 siblings, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-11 16:44 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: > On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: > > On 01/11/2010 08:13 AM, Jens Axboe wrote: > >> > >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: > >>> > >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: > >>>> > >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: > >>>>> > >>>>> NCQ SSDs' performances are not affected by > >>>>> distance of read requests, so there is no point in having > >>>>> overhead to merge such queues. > >>>>> > >>>>> Non-NCQ SSDs showed regression in some special cases, so > >>>>> they are ruled out by this patch. > >>>>> > >>>>> This patch intentionally doesn't affect writes, so > >>>>> it changes the queued[] field, to be indexed by > >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity > >>>>> for queues with WRITE requests. > >>>>> > >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> > >>>> > >>>> That's not really true. Overhead always increases as the total number > >>>> of > >>>> ATA commands issued increases. > >>> > >>> Jeff Moyer tested the patch on the workload that mostly benefit of > >>> queue merging, and found that > >>> the performance was improved by the patch. > >>> So removing the CPU overhead helps much more than the marginal gain > >>> given by merging on this hardware. > >> > >> It's not always going to be true. On SATA the command overhead is fairly > >> low, but on other hardware that may not be the case. Unless you are CPU > >> bound by your IO device, then merging will always be beneficial. 
I'm a > >> little behind on emails after my vacation, Jeff what numbers did you > >> generate and on what hardware? > > > > ...and on what workload? "the workload that mostly benefit of queue > > merging" is highly subjective, and likely does not cover most workloads SSDs > > will see in the field. > Hi Jeff, > exactly. > The workloads that benefit from queue merging are the ones in which a > sequential > read is actually split, and carried out by different processes in > different I/O contexts, each > sending requests with strides. This is clearly not the best way of > doing sequential access > (I would happily declare those programs as buggy). > CFQ has code that merges queues in this case. I'm disabling the READ > part for NCQ SSDs, > since, as Jeff measured, the code overhead outweighs the gain from > merging (if any). Hi Corrado, In Jeff's test case of running read-test2, I am not even sure if any merging between the queues took place or not, as on an NCQ SSD we are driving deeper queue depths, and unless read-test2 is creating more than 32 threads, there might not be any merging taking place at all. We also don't have any data/numbers on what kind of CPU savings this patch brings in. Vivek > > As you said, most workloads don't benefit from queue merging. On those, > the patch > just removes an overhead. > > Thanks, > Corrado > > > Jeff > > > > > > > > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 16:44 ` Vivek Goyal @ 2010-01-11 17:00 ` Corrado Zoccolo 2010-01-11 17:07 ` Vivek Goyal 2010-01-11 17:11 ` Vivek Goyal 0 siblings, 2 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-11 17:00 UTC (permalink / raw) To: Vivek Goyal Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 5:44 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: >> On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: >> > On 01/11/2010 08:13 AM, Jens Axboe wrote: >> >> >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: >> >>> >> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >> >>>> >> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >> >>>>> >> >>>>> NCQ SSDs' performances are not affected by >> >>>>> distance of read requests, so there is no point in having >> >>>>> overhead to merge such queues. >> >>>>> >> >>>>> Non-NCQ SSDs showed regression in some special cases, so >> >>>>> they are ruled out by this patch. >> >>>>> >> >>>>> This patch intentionally doesn't affect writes, so >> >>>>> it changes the queued[] field, to be indexed by >> >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >> >>>>> for queues with WRITE requests. >> >>>>> >> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >> >>>> >> >>>> That's not really true. Overhead always increases as the total number >> >>>> of >> >>>> ATA commands issued increases. >> >>> >> >>> Jeff Moyer tested the patch on the workload that mostly benefit of >> >>> queue merging, and found that >> >>> the performance was improved by the patch. >> >>> So removing the CPU overhead helps much more than the marginal gain >> >>> given by merging on this hardware. >> >> >> >> It's not always going to be true. 
On SATA the command overhead is fairly >> >> low, but on other hardware that may not be the case. Unless you are CPU >> >> bound by your IO device, then merging will always be beneficial. I'm a >> >> little behind on emails after my vacation, Jeff what numbers did you >> >> generate and on what hardware? >> > >> > ...and on what workload? "the workload that mostly benefit of queue >> > merging" is highly subjective, and likely does not cover most workloads SSDs >> > will see in the field. >> Hi Jeff, >> exactly. >> The workloads that benefit from queue merging are the ones in which a >> sequential >> read is actually split, and carried out by different processes in >> different I/O contexts, each >> sending requests with strides. This is clearly not the best way of >> doing sequential access >> (I would happily declare those programs as buggy). >> CFQ has code that merges queues in this case. I'm disabling the READ >> part for NCQ SSDs, >> since, as Jeff measured, the code overhead outweighs the gain from >> merging (if any). > > Hi Corrado, > > In Jeff's test case of running read-test2, I am not even sure if any > merging between the queues took place or not, as on an NCQ SSD we are driving > deeper queue depths, and unless read-test2 is creating more than 32 > threads, there might not be any merging taking place at all. Jeff's test was modeled after real use cases: widely used, legacy programs like dump. Since we often said that splitting the sequential stream across multiple threads was not the correct approach, and we did introduce the change in the kernel just to support those programs (not to encourage writing more programs like them), we can assume that if they do not drive deeper queues, no one will. So the overhead is just overhead, and will never give any benefit. I therefore want to remove it, since on SSDs it matters. > > We also don't have any data/numbers on what kind of CPU savings this > patch brings in. 
Jeff's test showed larger bandwidth with merge disabled, so it implies some saving is present. Thanks, Corrado > > Vivek > >> >> As you said, most workloads don't benefit from queue merging. On those >> ones, the patch >> just removes an overhead. >> >> Thanks, >> Corrado >> >> > Jeff >> > >> > >> > >> > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 17:00 ` Corrado Zoccolo @ 2010-01-11 17:07 ` Vivek Goyal 2010-01-11 19:05 ` Corrado Zoccolo 2010-01-11 17:11 ` Vivek Goyal 1 sibling, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-11 17:07 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 06:00:51PM +0100, Corrado Zoccolo wrote: > On Mon, Jan 11, 2010 at 5:44 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > > On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: > >> On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: > >> > On 01/11/2010 08:13 AM, Jens Axboe wrote: > >> >> > >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: > >> >>> > >> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: > >> >>>> > >> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: > >> >>>>> > >> >>>>> NCQ SSDs' performances are not affected by > >> >>>>> distance of read requests, so there is no point in having > >> >>>>> overhead to merge such queues. > >> >>>>> > >> >>>>> Non-NCQ SSDs showed regression in some special cases, so > >> >>>>> they are ruled out by this patch. > >> >>>>> > >> >>>>> This patch intentionally doesn't affect writes, so > >> >>>>> it changes the queued[] field, to be indexed by > >> >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity > >> >>>>> for queues with WRITE requests. > >> >>>>> > >> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> > >> >>>> > >> >>>> That's not really true. Overhead always increases as the total number > >> >>>> of > >> >>>> ATA commands issued increases. > >> >>> > >> >>> Jeff Moyer tested the patch on the workload that mostly benefit of > >> >>> queue merging, and found that > >> >>> the performance was improved by the patch. 
> >> >>> So removing the CPU overhead helps much more than the marginal gain > >> >>> given by merging on this hardware. > >> >> > >> >> It's not always going to be true. On SATA the command overhead is fairly > >> >> low, but on other hardware that may not be the case. Unless you are CPU > >> >> bound by your IO device, then merging will always be beneficial. I'm a > >> >> little behind on emails after my vacation, Jeff what numbers did you > >> >> generate and on what hardware? > >> > > >> > ...and on what workload? "the workload that mostly benefit of queue > >> > merging" is highly subjective, and likely does not cover most workloads SSDs > >> > will see in the field. > >> Hi Jeff, > >> exactly. > >> The workloads that benefits from queue merging are the ones in which a > >> sequential > >> read is actually splitt, and carried out by different processes in > >> different I/O context, each > >> sending requests with strides. This is clearly not the best way of > >> doing sequential access > >> (I would happily declare those programs as buggy). > >> CFQ has code that merges queues in this case. I'm disabling the READ > >> part for NCQ SSDs, > >> since, as Jeff measured, the code overhead outweight the gain from > >> merging (if any). > > > > Hi Corrado, > > > > In Jeff's test case of running read-test2, I am not even sure if any > > merging between the queues took place or not as on NCQ SSD, we are driving > > deeper queue depths and unless read-test2 is creating more than 32 > > threads, there might not be any merging taking place at all. > > Jeff's test was modeled after real use cases: widely used, legacy > programs like dump. > Since we often said that splitting the sequential stream in multiple > threads was not the > correct approach, and we did introduce the change in the kernel just > to support those > programs (not to encourage writing more of this league), we can assume > that if they > do not drive deeper queues, no one will. 
So the overhead is just > overhead, and will never > give any benefit. > > I therefore want to remove it, since for SSD it matters. > > > > We also don't have any data/numbers what kind of cpu savings does this > > patch bring in. > > Jeff's test showed larger bandwidth with merge disabled, so it implies > some saving is present. Following is what Jeff had posted. ==> vanilla <== Mean: 163.22728 Population Std. Dev.: 0.55401 ==> patched <== Mean: 162.91558 Population Std. Dev.: 1.08612 I see that with the patched kernel (your patches), "Mean" BW of 50 runs has gone down slightly. So where is the improvement in terms of BW? (Are you referring to the higher standard deviation, which means some of the runs observed higher BW, and concluding something from that?) Vivek > > Thanks, > Corrado > > > > > Vivek > > > >> > >> As you said, most workloads don't benefit from queue merging. On those > >> ones, the patch > >> just removes an overhead. > >> > >> Thanks, > >> Corrado > >> > >> > Jeff > >> > > >> > > >> > > >> > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 17:07 ` Vivek Goyal @ 2010-01-11 19:05 ` Corrado Zoccolo 0 siblings, 0 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-11 19:05 UTC (permalink / raw) To: Vivek Goyal Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 6:07 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Mon, Jan 11, 2010 at 06:00:51PM +0100, Corrado Zoccolo wrote: >> On Mon, Jan 11, 2010 at 5:44 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >> > On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: >> >> On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: >> >> > On 01/11/2010 08:13 AM, Jens Axboe wrote: >> >> >> >> >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: >> >> >>> >> >> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >> >> >>>> >> >> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >> >> >>>>> >> >> >>>>> NCQ SSDs' performances are not affected by >> >> >>>>> distance of read requests, so there is no point in having >> >> >>>>> overhead to merge such queues. >> >> >>>>> >> >> >>>>> Non-NCQ SSDs showed regression in some special cases, so >> >> >>>>> they are ruled out by this patch. >> >> >>>>> >> >> >>>>> This patch intentionally doesn't affect writes, so >> >> >>>>> it changes the queued[] field, to be indexed by >> >> >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >> >> >>>>> for queues with WRITE requests. >> >> >>>>> >> >> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >> >> >>>> >> >> >>>> That's not really true. Overhead always increases as the total number >> >> >>>> of >> >> >>>> ATA commands issued increases. >> >> >>> >> >> >>> Jeff Moyer tested the patch on the workload that mostly benefit of >> >> >>> queue merging, and found that >> >> >>> the performance was improved by the patch. 
>> >> >>> So removing the CPU overhead helps much more than the marginal gain >> >> >>> given by merging on this hardware. >> >> >> >> >> >> It's not always going to be true. On SATA the command overhead is fairly >> >> >> low, but on other hardware that may not be the case. Unless you are CPU >> >> >> bound by your IO device, then merging will always be beneficial. I'm a >> >> >> little behind on emails after my vacation, Jeff what numbers did you >> >> >> generate and on what hardware? >> >> > >> >> > ...and on what workload? "the workload that mostly benefit of queue >> >> > merging" is highly subjective, and likely does not cover most workloads SSDs >> >> > will see in the field. >> >> Hi Jeff, >> >> exactly. >> >> The workloads that benefits from queue merging are the ones in which a >> >> sequential >> >> read is actually splitt, and carried out by different processes in >> >> different I/O context, each >> >> sending requests with strides. This is clearly not the best way of >> >> doing sequential access >> >> (I would happily declare those programs as buggy). >> >> CFQ has code that merges queues in this case. I'm disabling the READ >> >> part for NCQ SSDs, >> >> since, as Jeff measured, the code overhead outweight the gain from >> >> merging (if any). >> > >> > Hi Corrado, >> > >> > In Jeff's test case of running read-test2, I am not even sure if any >> > merging between the queues took place or not as on NCQ SSD, we are driving >> > deeper queue depths and unless read-test2 is creating more than 32 >> > threads, there might not be any merging taking place at all. >> >> Jeff's test was modeled after real use cases: widely used, legacy >> programs like dump. 
>> Since we often said that splitting the sequential stream in multiple >> threads was not the >> correct approach, and we did introduce the change in the kernel just >> to support those >> programs (not to encourage writing more of this league), we can assume >> that if they >> do not drive deeper queues, no one will. So the overhead is just >> overhead, and will never >> give any benefit. >> >> I therefore want to remove it, since for SSD it matters. >> > >> > We also don't have any data/numbers what kind of cpu savings does this >> > patch bring in. >> >> Jeff's test showed larger bandwidth with merge disabled, so it implies >> some saving is present. > > Following is what Jeff had posted. > > ==> vanilla <== > Mean: 163.22728 > Population Std. Dev.: 0.55401 > > ==> patched <== > Mean: 162.91558 > Population Std. Dev.: 1.08612 > > > I see that with patched kernel(your patches), "Mean" BW of 50 runs has gone > down slightly. So where is the improvement in terms of BW? (Are you referring > to higher standard deviation, that means some of the runs observed higher BW > and concluding something from that?) Sorry, I wrongly remembered the numbers were the opposite. Corrado > > Vivek > >> >> Thanks, >> Corrado >> >> > >> > Vivek >> > >> >> >> >> As you said, most workloads don't benefit from queue merging. On those >> >> ones, the patch >> >> just removes an overhead. >> >> >> >> Thanks, >> >> Corrado >> >> >> >> > Jeff >> >> > >> >> > >> >> > >> >> > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 17:00 ` Corrado Zoccolo 2010-01-11 17:07 ` Vivek Goyal @ 2010-01-11 17:11 ` Vivek Goyal 2010-01-11 19:09 ` Corrado Zoccolo 1 sibling, 1 reply; 41+ messages in thread From: Vivek Goyal @ 2010-01-11 17:11 UTC (permalink / raw) To: Corrado Zoccolo Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 06:00:51PM +0100, Corrado Zoccolo wrote: > On Mon, Jan 11, 2010 at 5:44 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > > On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: > >> On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: > >> > On 01/11/2010 08:13 AM, Jens Axboe wrote: > >> >> > >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: > >> >>> > >> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: > >> >>>> > >> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: > >> >>>>> > >> >>>>> NCQ SSDs' performances are not affected by > >> >>>>> distance of read requests, so there is no point in having > >> >>>>> overhead to merge such queues. > >> >>>>> > >> >>>>> Non-NCQ SSDs showed regression in some special cases, so > >> >>>>> they are ruled out by this patch. > >> >>>>> > >> >>>>> This patch intentionally doesn't affect writes, so > >> >>>>> it changes the queued[] field, to be indexed by > >> >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity > >> >>>>> for queues with WRITE requests. > >> >>>>> > >> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> > >> >>>> > >> >>>> That's not really true. Overhead always increases as the total number > >> >>>> of > >> >>>> ATA commands issued increases. > >> >>> > >> >>> Jeff Moyer tested the patch on the workload that mostly benefit of > >> >>> queue merging, and found that > >> >>> the performance was improved by the patch. 
> >> >>> So removing the CPU overhead helps much more than the marginal gain > >> >>> given by merging on this hardware. > >> >> > >> >> It's not always going to be true. On SATA the command overhead is fairly > >> >> low, but on other hardware that may not be the case. Unless you are CPU > >> >> bound by your IO device, then merging will always be beneficial. I'm a > >> >> little behind on emails after my vacation, Jeff what numbers did you > >> >> generate and on what hardware? > >> > > >> > ...and on what workload? "the workload that mostly benefit of queue > >> > merging" is highly subjective, and likely does not cover most workloads SSDs > >> > will see in the field. > >> Hi Jeff, > >> exactly. > >> The workloads that benefits from queue merging are the ones in which a > >> sequential > >> read is actually splitt, and carried out by different processes in > >> different I/O context, each > >> sending requests with strides. This is clearly not the best way of > >> doing sequential access > >> (I would happily declare those programs as buggy). > >> CFQ has code that merges queues in this case. I'm disabling the READ > >> part for NCQ SSDs, > >> since, as Jeff measured, the code overhead outweight the gain from > >> merging (if any). > > > > Hi Corrado, > > > > In Jeff's test case of running read-test2, I am not even sure if any > > merging between the queues took place or not as on NCQ SSD, we are driving > > deeper queue depths and unless read-test2 is creating more than 32 > > threads, there might not be any merging taking place at all. > > Jeff's test was modeled after real use cases: widely used, legacy > programs like dump. > Since we often said that splitting the sequential stream in multiple > threads was not the > correct approach, and we did introduce the change in the kernel just > to support those > programs (not to encourage writing more of this league), we can assume > that if they > do not drive deeper queues, no one will. 
So the overhead is just > overhead, and will never > give any benefit. Two things come to mind. - Even if dump/read-test2 is not driving deeper queue depths, other competing programs might be, which can give a queue merging opportunity for the dump program on an NCQ SSD. - If we think that on an NCQ SSD we practically don't have any queue merging opportunity, then is there any point in keeping it enabled for WRITES either? Vivek > > I therefore want to remove it, since for SSD it matters. > > > > We also don't have any data/numbers what kind of cpu savings does this > > patch bring in. > > Jeff's test showed larger bandwidth with merge disabled, so it implies > some saving is present. > > Thanks, > Corrado > > > > > Vivek > > > >> > >> As you said, most workloads don't benefit from queue merging. On those > >> ones, the patch > >> just removes an overhead. > >> > >> Thanks, > >> Corrado > >> > >> > Jeff > >> > > >> > > >> > > >> > ^ permalink raw reply [flat|nested] 41+ messages in thread
* Re: [PATCH] cfq-iosched: NCQ SSDs do not need read queue merging 2010-01-11 17:11 ` Vivek Goyal @ 2010-01-11 19:09 ` Corrado Zoccolo 0 siblings, 0 replies; 41+ messages in thread From: Corrado Zoccolo @ 2010-01-11 19:09 UTC (permalink / raw) To: Vivek Goyal Cc: Jeff Garzik, Jens Axboe, Linux-Kernel, Jeff Moyer, Shaohua Li, Gui Jianfeng On Mon, Jan 11, 2010 at 6:11 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Mon, Jan 11, 2010 at 06:00:51PM +0100, Corrado Zoccolo wrote: >> On Mon, Jan 11, 2010 at 5:44 PM, Vivek Goyal <vgoyal@redhat.com> wrote: >> > On Mon, Jan 11, 2010 at 03:53:00PM +0100, Corrado Zoccolo wrote: >> >> On Mon, Jan 11, 2010 at 2:18 PM, Jeff Garzik <jeff@garzik.org> wrote: >> >> > On 01/11/2010 08:13 AM, Jens Axboe wrote: >> >> >> >> >> >> On Mon, Jan 11 2010, Corrado Zoccolo wrote: >> >> >>> >> >> >>> On Mon, Jan 11, 2010 at 12:25 PM, Jeff Garzik<jeff@garzik.org> wrote: >> >> >>>> >> >> >>>> On 01/10/2010 04:04 PM, Corrado Zoccolo wrote: >> >> >>>>> >> >> >>>>> NCQ SSDs' performances are not affected by >> >> >>>>> distance of read requests, so there is no point in having >> >> >>>>> overhead to merge such queues. >> >> >>>>> >> >> >>>>> Non-NCQ SSDs showed regression in some special cases, so >> >> >>>>> they are ruled out by this patch. >> >> >>>>> >> >> >>>>> This patch intentionally doesn't affect writes, so >> >> >>>>> it changes the queued[] field, to be indexed by >> >> >>>>> READ/WRITE instead of SYNC/ASYNC, and only compute proximity >> >> >>>>> for queues with WRITE requests. >> >> >>>>> >> >> >>>>> Signed-off-by: Corrado Zoccolo<czoccolo@gmail.com> >> >> >>>> >> >> >>>> That's not really true. Overhead always increases as the total number >> >> >>>> of >> >> >>>> ATA commands issued increases. >> >> >>> >> >> >>> Jeff Moyer tested the patch on the workload that mostly benefit of >> >> >>> queue merging, and found that >> >> >>> the performance was improved by the patch. 
>> >> >>> So removing the CPU overhead helps much more than the marginal gain >> >> >>> given by merging on this hardware. >> >> >> >> >> >> It's not always going to be true. On SATA the command overhead is fairly >> >> >> low, but on other hardware that may not be the case. Unless you are CPU >> >> >> bound by your IO device, then merging will always be beneficial. I'm a >> >> >> little behind on emails after my vacation, Jeff what numbers did you >> >> >> generate and on what hardware? >> >> > >> >> > ...and on what workload? "the workload that mostly benefit of queue >> >> > merging" is highly subjective, and likely does not cover most workloads SSDs >> >> > will see in the field. >> >> Hi Jeff, >> >> exactly. >> >> The workloads that benefits from queue merging are the ones in which a >> >> sequential >> >> read is actually splitt, and carried out by different processes in >> >> different I/O context, each >> >> sending requests with strides. This is clearly not the best way of >> >> doing sequential access >> >> (I would happily declare those programs as buggy). >> >> CFQ has code that merges queues in this case. I'm disabling the READ >> >> part for NCQ SSDs, >> >> since, as Jeff measured, the code overhead outweight the gain from >> >> merging (if any). >> > >> > Hi Corrado, >> > >> > In Jeff's test case of running read-test2, I am not even sure if any >> > merging between the queues took place or not as on NCQ SSD, we are driving >> > deeper queue depths and unless read-test2 is creating more than 32 >> > threads, there might not be any merging taking place at all. >> >> Jeff's test was modeled after real use cases: widely used, legacy >> programs like dump. 
>> Since we often said that splitting the sequential stream in multiple >> threads was not the >> correct approach, and we did introduce the change in the kernel just >> to support those >> programs (not to encourage writing more of this league), we can assume >> that if they >> do not drive deeper queues, no one will. So the overhead is just >> overhead, and will never >> give any benefit. > > Two things come to mind. > > - Even if dump/read-test2 is not driving deeper queue depths, but other > competing programs might be driving deeper queue depths, which can give > queue merging opportunity in case of dump program on NCQ SSD. Correct. We need a better test to determine if those merges can happen. > > - If we are thinking that on NCQ SSD we practically don't have queue > merging opportunity then there is no point in keeping it enabled for > WRITES also? Probably. Needs test here, too. Corrado > > Vivek > >> >> I therefore want to remove it, since for SSD it matters. >> > >> > We also don't have any data/numbers what kind of cpu savings does this >> > patch bring in. >> >> Jeff's test showed larger bandwidth with merge disabled, so it implies >> some saving is present. >> >> Thanks, >> Corrado >> >> > >> > Vivek >> > >> >> >> >> As you said, most workloads don't benefit from queue merging. On those >> >> ones, the patch >> >> just removes an overhead. >> >> >> >> Thanks, >> >> Corrado >> >> ^ permalink raw reply [flat|nested] 41+ messages in thread
end of thread, other threads:[~2010-01-11 19:09 UTC | newest] Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2009-12-30 12:10 [PATCH] cfq-iosched: non-rot devices do not need queue merging Corrado Zoccolo 2009-12-30 18:45 ` Jens Axboe 2009-12-30 20:31 ` Corrado Zoccolo 2009-12-30 21:11 ` Jens Axboe 2009-12-30 21:21 ` Corrado Zoccolo 2009-12-30 21:34 ` Jens Axboe 2009-12-30 22:22 ` [PATCH] cfq-iosched: non-rot devices do not need read " Corrado Zoccolo 2010-01-04 14:47 ` Vivek Goyal 2010-01-04 16:36 ` Corrado Zoccolo 2010-01-04 16:51 ` Jeff Moyer 2010-01-04 18:32 ` Vivek Goyal 2010-01-04 18:37 ` Corrado Zoccolo 2010-01-04 18:51 ` Vivek Goyal 2010-01-04 19:04 ` Jeff Moyer 2010-01-04 20:37 ` Corrado Zoccolo 2010-01-05 14:58 ` Jeff Moyer 2010-01-05 15:13 ` Vivek Goyal 2010-01-05 21:19 ` Jeff Moyer 2010-01-05 21:48 ` Corrado Zoccolo 2010-01-07 10:56 ` Kirill Afonshin 2010-01-07 13:38 ` Corrado Zoccolo 2010-01-07 14:36 ` Vivek Goyal 2010-01-07 17:00 ` Corrado Zoccolo 2010-01-07 18:37 ` Vivek Goyal 2010-01-07 20:16 ` Corrado Zoccolo 2010-01-08 18:53 ` Vivek Goyal 2010-01-10 12:55 ` Corrado Zoccolo 2010-01-10 21:04 ` [PATCH] cfq-iosched: NCQ SSDs " Corrado Zoccolo 2010-01-10 21:08 ` Corrado Zoccolo 2010-01-11 11:25 ` Jeff Garzik 2010-01-11 12:26 ` Corrado Zoccolo 2010-01-11 13:13 ` Jens Axboe 2010-01-11 13:18 ` Jeff Garzik 2010-01-11 13:24 ` Jens Axboe 2010-01-11 14:53 ` Corrado Zoccolo 2010-01-11 16:44 ` Vivek Goyal 2010-01-11 17:00 ` Corrado Zoccolo 2010-01-11 17:07 ` Vivek Goyal 2010-01-11 19:05 ` Corrado Zoccolo 2010-01-11 17:11 ` Vivek Goyal 2010-01-11 19:09 ` Corrado Zoccolo