* [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg @ 2017-08-31 6:46 Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 1/3] block, bfq: make lookup_next_entity push up vtime on expirations Paolo Valente ` (4 more replies) 0 siblings, 5 replies; 16+ messages in thread From: Paolo Valente @ 2017-08-31 6:46 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, broonie, mgorman, lee.tibbert, oleksandr, Paolo Valente [SECOND TAKE, with just the name of one of the testers fixed] Hi, while testing the read-write unfairness issues reported by Mel, I found BFQ failing to guarantee good responsiveness against heavy random sync writes in the background, i.e., multiple writers doing random writes and systematic fdatasync [1]. The failure was caused by three related bugs, because of which BFQ failed to guarantee to high-weight processes the expected fraction of the throughput. The three patches in this series fix these bugs. These fixes restore the usual BFQ service guarantees (and thus optimal responsiveness too), against the above background workload and, probably, against other similar workloads. Thanks, Paolo [1] https://lkml.org/lkml/2017/8/9/957 Paolo Valente (3): block, bfq: make lookup_next_entity push up vtime on expirations block, bfq: remove direct switch to an entity in higher class block, bfq: guarantee update_next_in_service always returns an eligible entity block/bfq-iosched.c | 4 +-- block/bfq-iosched.h | 4 +-- block/bfq-wf2q.c | 91 ++++++++++++++++++++++++++++++++--------------------- 3 files changed, 60 insertions(+), 39 deletions(-) -- 2.10.0 ^ permalink raw reply [flat|nested] 16+ messages in thread
* [PATCH BUGFIX/IMPROVEMENT V2 1/3] block, bfq: make lookup_next_entity push up vtime on expirations 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente @ 2017-08-31 6:46 ` Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 2/3] block, bfq: remove direct switch to an entity in higher class Paolo Valente ` (3 subsequent siblings) 4 siblings, 0 replies; 16+ messages in thread From: Paolo Valente @ 2017-08-31 6:46 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, broonie, mgorman, lee.tibbert, oleksandr, Paolo Valente To provide a very smooth service, bfq starts to serve a bfq_queue only if the queue is 'eligible', i.e., if the same queue would have started to be served in the ideal, perfectly fair system that bfq simulates internally. This is obtained by associating each queue with a virtual start time, and by computing a special system virtual time quantity: a queue is eligible only if the system virtual time has reached the virtual start time of the queue. Finally, bfq guarantees that, when a new queue must be set in service, there is always at least one eligible entity for each active parent entity in the scheduler. To provide this guarantee, the function __bfq_lookup_next_entity pushes up, for each parent entity on which it is invoked, the system virtual time to the minimum among the virtual start times of the entities in the active tree for the parent entity (more precisely, the push up occurs if the system virtual time happens to be lower than all such virtual start times). There is however a circumstance in which __bfq_lookup_next_entity cannot push up the system virtual time for a parent entity, even if the system virtual time is lower than the virtual start times of all the child entities in the active tree. It happens if one of the child entities is in service. 
In fact, in such a case, there is already an eligible entity, the in-service one, even if it may not be present in the active tree (because in-service entities may be removed from the active tree). Unfortunately, in the last re-design of the hierarchical-scheduling engine, the reset of the pointer to the in-service entity for a given parent entity--reset to be done as a consequence of the expiration of the in-service entity--always happens after the function __bfq_lookup_next_entity has been invoked. This causes the function to think that there is still an entity in service for the parent entity, and therefore that the system virtual time cannot be pushed up, even though such a no-longer-in-service entity has actually already been properly reinserted into the active tree (or into some other tree, if no longer active). Yet, the system virtual time *had* to be pushed up, to be ready to correctly choose the next queue to serve. Because of the lack of this push up, bfq may wrongly set in service a queue that had been speculatively pre-computed as the possible next-in-service queue, but that would no longer be the one to serve after the expiration and the reinsertion into the active trees of the previously in-service entities. This commit addresses this issue by making __bfq_lookup_next_entity properly push up the system virtual time if an expiration is occurring.
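[Editorial illustration, not part of the patch.] The eligibility rule and the virtual-time push-up described above can be sketched in a few lines of simplified, self-contained C. All names are hypothetical stand-ins, and a plain array replaces bfq's active rb-tree; this is not the actual bfq-wf2q.c code.

```c
/*
 * Simplified model of the eligibility/vtime mechanism described in
 * the commit message above.  Hypothetical stand-ins: a plain array
 * replaces the active rb-tree, and field names are illustrative.
 */
struct entity {
	unsigned long start;	/* virtual start time */
	unsigned long finish;	/* virtual finish time */
};

struct service_tree {
	unsigned long vtime;		/* system virtual time */
	struct entity *active[16];	/* stand-in for the active tree */
	int nr_active;
};

/* An entity is eligible once the system virtual time has reached its
 * virtual start time. */
static int is_eligible(const struct service_tree *st, const struct entity *e)
{
	return e->start <= st->vtime;
}

/*
 * The push-up this patch is about: if the system virtual time is
 * lower than every virtual start time in the active tree, advance it
 * to the minimum start time, so that at least one entity becomes
 * eligible.  The fix makes sure this also happens on expiration,
 * when the just-expired entity no longer blocks the push-up.
 */
static void push_up_vtime(struct service_tree *st)
{
	unsigned long min_start;
	int i;

	if (st->nr_active == 0)
		return;

	min_start = st->active[0]->start;
	for (i = 1; i < st->nr_active; i++)
		if (st->active[i]->start < min_start)
			min_start = st->active[i]->start;

	if (st->vtime < min_start)
		st->vtime = min_start;
}
```

For example, with vtime at 5 and active start times 10 and 15, no entity is eligible until push_up_vtime() advances vtime to 10, after which exactly the entity with the earliest start time becomes servable.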
Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> --- block/bfq-iosched.c | 4 ++-- block/bfq-iosched.h | 4 ++-- block/bfq-wf2q.c | 58 +++++++++++++++++++++++++++++++++++++++-------------- 3 files changed, 47 insertions(+), 19 deletions(-) diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c index 436b6ca..a10f147 100644 --- a/block/bfq-iosched.c +++ b/block/bfq-iosched.c @@ -720,7 +720,7 @@ static void bfq_updated_next_req(struct bfq_data *bfqd, entity->budget = new_budget; bfq_log_bfqq(bfqd, bfqq, "updated next rq: new budget %lu", new_budget); - bfq_requeue_bfqq(bfqd, bfqq); + bfq_requeue_bfqq(bfqd, bfqq, false); } } @@ -2563,7 +2563,7 @@ static void __bfq_bfqq_expire(struct bfq_data *bfqd, struct bfq_queue *bfqq) bfq_del_bfqq_busy(bfqd, bfqq, true); } else { - bfq_requeue_bfqq(bfqd, bfqq); + bfq_requeue_bfqq(bfqd, bfqq, true); /* * Resort priority tree of potential close cooperators. 
*/ diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h index 859f0a8..3e2659c 100644 --- a/block/bfq-iosched.h +++ b/block/bfq-iosched.h @@ -817,7 +817,6 @@ extern const int bfq_timeout; struct bfq_queue *bic_to_bfqq(struct bfq_io_cq *bic, bool is_sync); void bic_set_bfqq(struct bfq_io_cq *bic, struct bfq_queue *bfqq, bool is_sync); struct bfq_data *bic_to_bfqd(struct bfq_io_cq *bic); -void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq); void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq); void bfq_weights_tree_add(struct bfq_data *bfqd, struct bfq_entity *entity, struct rb_root *root); @@ -917,7 +916,8 @@ void __bfq_bfqd_reset_in_service(struct bfq_data *bfqd); void bfq_deactivate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq, bool ins_into_idle_tree, bool expiration); void bfq_activate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq); -void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq); +void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq, + bool expiration); void bfq_del_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq, bool expiration); void bfq_add_bfqq_busy(struct bfq_data *bfqd, struct bfq_queue *bfqq); diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index 3183b39..138732e 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -44,7 +44,8 @@ static unsigned int bfq_class_idx(struct bfq_entity *entity) BFQ_DEFAULT_GRP_CLASS - 1; } -static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd); +static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd, + bool expiration); static bool bfq_update_parent_budget(struct bfq_entity *next_in_service); @@ -54,6 +55,8 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service); * @new_entity: if not NULL, pointer to the entity whose activation, * requeueing or repositioning triggered the invocation of * this function.
+ * @expiration: if true, this function is being invoked after the + * expiration of the in-service entity * * This function is called to update sd->next_in_service, which, in * its turn, may change as a consequence of the insertion or @@ -72,7 +75,8 @@ static bool bfq_update_parent_budget(struct bfq_entity *next_in_service); * entity. */ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, - struct bfq_entity *new_entity) + struct bfq_entity *new_entity, + bool expiration) { struct bfq_entity *next_in_service = sd->next_in_service; bool parent_sched_may_change = false; @@ -130,7 +134,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, if (replace_next) next_in_service = new_entity; } else /* invoked because of a deactivation: lookup needed */ - next_in_service = bfq_lookup_next_entity(sd); + next_in_service = bfq_lookup_next_entity(sd, expiration); if (next_in_service) { parent_sched_may_change = !sd->next_in_service || @@ -1134,10 +1138,12 @@ static void __bfq_activate_requeue_entity(struct bfq_entity *entity, * @requeue: true if this is a requeue, which implies that bfqq is * being expired; thus ALL its ancestors stop being served and must * therefore be requeued + * @expiration: true if this function is being invoked in the expiration path + * of the in-service queue */ static void bfq_activate_requeue_entity(struct bfq_entity *entity, bool non_blocking_wait_rq, - bool requeue) + bool requeue, bool expiration) { struct bfq_sched_data *sd; @@ -1145,7 +1151,8 @@ static void bfq_activate_requeue_entity(struct bfq_entity *entity, sd = entity->sched_data; __bfq_activate_requeue_entity(entity, sd, non_blocking_wait_rq); - if (!bfq_update_next_in_service(sd, entity) && !requeue) + if (!bfq_update_next_in_service(sd, entity, expiration) && + !requeue) break; } } @@ -1201,6 +1208,8 @@ bool __bfq_deactivate_entity(struct bfq_entity *entity, bool ins_into_idle_tree) * bfq_deactivate_entity - deactivate an entity representing a bfq_queue.
* @entity: the entity to deactivate. * @ins_into_idle_tree: true if the entity can be put into the idle tree + * @expiration: true if this function is being invoked in the expiration path + * of the in-service queue */ static void bfq_deactivate_entity(struct bfq_entity *entity, bool ins_into_idle_tree, @@ -1229,7 +1238,7 @@ static void bfq_deactivate_entity(struct bfq_entity *entity, * then, since entity has just been * deactivated, a new one must be found. */ - bfq_update_next_in_service(sd, NULL); + bfq_update_next_in_service(sd, NULL, expiration); if (sd->next_in_service || sd->in_service_entity) { /* @@ -1288,7 +1297,7 @@ static void bfq_deactivate_entity(struct bfq_entity *entity, __bfq_requeue_entity(entity); sd = entity->sched_data; - if (!bfq_update_next_in_service(sd, entity) && + if (!bfq_update_next_in_service(sd, entity, expiration) && !expiration) /* * next_in_service unchanged or not causing @@ -1423,12 +1432,14 @@ __bfq_lookup_next_entity(struct bfq_service_tree *st, bool in_service) /** * bfq_lookup_next_entity - return the first eligible entity in @sd. * @sd: the sched_data. + * @expiration: true if we are on the expiration path of the in-service queue * * This function is invoked when there has been a change in the trees - * for sd, and we need know what is the new next entity after this - * change. + * for sd, and we need to know what is the new next entity to serve + * after this change. */ -static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd) +static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd, + bool expiration) { struct bfq_service_tree *st = sd->service_tree; struct bfq_service_tree *idle_class_st = st + (BFQ_IOPRIO_CLASSES - 1); @@ -1455,8 +1466,24 @@ static struct bfq_entity *bfq_lookup_next_entity(struct bfq_sched_data *sd) * class, unless the idle class needs to be served. 
*/ for (; class_idx < BFQ_IOPRIO_CLASSES; class_idx++) { + /* + * If expiration is true, then bfq_lookup_next_entity + * is being invoked as a part of the expiration path + * of the in-service queue. In this case, even if + * sd->in_service_entity is not NULL, + * sd->in_service_entity at this point is actually not + * in service any more, and, if needed, has already + * been properly queued or requeued into the right + * tree. The reason why sd->in_service_entity is still + * not NULL here, even if expiration is true, is that + * sd->in_service_entity is reset as a last step in the + * expiration path. So, if expiration is true, tell + * __bfq_lookup_next_entity that there is no + * sd->in_service_entity. + */ entity = __bfq_lookup_next_entity(st + class_idx, - sd->in_service_entity); + sd->in_service_entity && + !expiration); if (entity) break; @@ -1569,7 +1596,7 @@ struct bfq_queue *bfq_get_next_queue(struct bfq_data *bfqd) for_each_entity(entity) { struct bfq_sched_data *sd = entity->sched_data; - if (!bfq_update_next_in_service(sd, NULL)) + if (!bfq_update_next_in_service(sd, NULL, false)) break; } @@ -1617,16 +1644,17 @@ void bfq_activate_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq) struct bfq_entity *entity = &bfqq->entity; bfq_activate_requeue_entity(entity, bfq_bfqq_non_blocking_wait_rq(bfqq), - false); + false, false); bfq_clear_bfqq_non_blocking_wait_rq(bfqq); } -void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq) +void bfq_requeue_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq, + bool expiration) { struct bfq_entity *entity = &bfqq->entity; bfq_activate_requeue_entity(entity, false, - bfqq == bfqd->in_service_queue); + bfqq == bfqd->in_service_queue, expiration); } /* -- 2.10.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
* [PATCH BUGFIX/IMPROVEMENT V2 2/3] block, bfq: remove direct switch to an entity in higher class 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 1/3] block, bfq: make lookup_next_entity push up vtime on expirations Paolo Valente @ 2017-08-31 6:46 ` Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 3/3] block, bfq: guarantee update_next_in_service always returns an eligible entity Paolo Valente ` (2 subsequent siblings) 4 siblings, 0 replies; 16+ messages in thread From: Paolo Valente @ 2017-08-31 6:46 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, broonie, mgorman, lee.tibbert, oleksandr, Paolo Valente If the function bfq_update_next_in_service is invoked as a consequence of the activation or requeueing of an entity, say E, and finds out that E belongs to a higher-priority class than that of the current next-in-service entity, then it sets next_in_service directly to E. But this may lead to anomalous schedules, because E may happen not to be eligible for service, since its virtual start time is higher than the system virtual time for its service tree. This commit addresses this issue by simply removing this direct switch. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> --- block/bfq-wf2q.c | 19 +++++-------------- 1 file changed, 5 insertions(+), 14 deletions(-) diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index 138732e..eeaf326 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -86,9 +86,8 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, * or repositioning of an entity that does not coincide with * sd->next_in_service, then a full lookup in the active tree * can be avoided.
In fact, it is enough to check whether the - * just-modified entity has a higher priority than - * sd->next_in_service, or, even if it has the same priority - * as sd->next_in_service, is eligible and has a lower virtual + * just-modified entity has the same priority as + * sd->next_in_service, is eligible and has a lower virtual * finish time than sd->next_in_service. If this compound * condition holds, then the new entity becomes the new * next_in_service. Otherwise no change is needed. @@ -104,9 +103,8 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, /* * If there is already a next_in_service candidate - * entity, then compare class priorities or timestamps - * to decide whether to replace sd->service_tree with - * new_entity. + * entity, then compare timestamps to decide whether + * to replace sd->service_tree with new_entity. */ if (next_in_service) { unsigned int new_entity_class_idx = @@ -114,10 +112,6 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, struct bfq_service_tree *st = sd->service_tree + new_entity_class_idx; - /* - * For efficiency, evaluate the most likely - * sub-condition first. - */ replace_next = (new_entity_class_idx == bfq_class_idx(next_in_service) @@ -125,10 +119,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, !bfq_gt(new_entity->start, st->vtime) && bfq_gt(next_in_service->finish, - new_entity->finish)) - || - new_entity_class_idx < - bfq_class_idx(next_in_service); + new_entity->finish)); } if (replace_next) -- 2.10.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
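[Editorial illustration, not part of the patch.] The compound condition that remains in bfq_update_next_in_service after this patch can be restated as the small sketch below. Types and names are hypothetical stand-ins for the real bfq definitions, and plain comparisons replace the wraparound-safe bfq_gt() helpers; the point is that a higher-class entity alone no longer wins: a replacement must be same-class, eligible, and have a smaller virtual finish time.

```c
/*
 * Simplified restatement of the condition left in
 * bfq_update_next_in_service by this patch.  Types and names are
 * hypothetical stand-ins; plain comparisons replace bfq_gt().
 */
struct entity {
	int class_idx;		/* lower value = higher-priority class */
	unsigned long start;	/* virtual start time */
	unsigned long finish;	/* virtual finish time */
};

/*
 * The just-activated entity replaces the current candidate only if it
 * is in the same class, is eligible (start <= vtime of its service
 * tree), and has a smaller virtual finish time.  The removed clause
 * "new entity is in a higher class than next_in_service" is
 * intentionally absent: a higher-class but non-eligible entity must
 * not win.
 */
static int replaces_next(const struct entity *new_entity,
			 const struct entity *next_in_service,
			 unsigned long vtime)
{
	return new_entity->class_idx == next_in_service->class_idx &&
	       new_entity->start <= vtime &&
	       new_entity->finish < next_in_service->finish;
}
```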
* [PATCH BUGFIX/IMPROVEMENT V2 3/3] block, bfq: guarantee update_next_in_service always returns an eligible entity 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 1/3] block, bfq: make lookup_next_entity push up vtime on expirations Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 2/3] block, bfq: remove direct switch to an entity in higher class Paolo Valente @ 2017-08-31 6:46 ` Paolo Valente 2017-08-31 14:21 ` [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Jens Axboe 2017-08-31 14:42 ` Mel Gorman 4 siblings, 0 replies; 16+ messages in thread From: Paolo Valente @ 2017-08-31 6:46 UTC (permalink / raw) To: Jens Axboe Cc: linux-block, linux-kernel, ulf.hansson, broonie, mgorman, lee.tibbert, oleksandr, Paolo Valente If the function bfq_update_next_in_service is invoked as a consequence of the activation or requeueing of an entity, say E, then it doesn't invoke bfq_lookup_next_entity to get the next-in-service entity. In contrast, it follows a shorter path: if E happens to be eligible (see commit "bfq-sq-mq: make lookup_next_entity push up vtime on expirations" for details on eligibility) and to have a lower virtual finish time than the current candidate as next-in-service entity, then E directly becomes the next-in-service entity. Unfortunately, there is a corner case for which this shorter path makes bfq_update_next_in_service choose a non eligible entity: it occurs if both E and the current next-in-service entity happen to be non eligible when bfq_update_next_in_service is invoked. In this case, E is not set as next-in-service, and, since bfq_lookup_next_entity is not invoked, the state of the parent entity is not updated so as to end up with an eligible entity as the proper next-in-service entity. 
In this respect, next-in-service is actually allowed to be non eligible while some queue is in service: since no system-virtual-time push-up can be performed in that case (see again commit "bfq-sq-mq: make lookup_next_entity push up vtime on expirations" for details), next-in-service is chosen, speculatively, as a function of the possible value that the system virtual time may get after a push up. But the correctness of the schedule breaks if next-in-service is still a non eligible entity when it is time to set in service the next entity. Unfortunately, this may happen in the above corner case. This commit fixes this problem by making bfq_update_next_in_service invoke bfq_lookup_next_entity not only if the above shorter path cannot be taken, but also if the shorter path is taken but fails to yield an eligible next-in-service entity. Signed-off-by: Paolo Valente <paolo.valente@linaro.org> Tested-by: Lee Tibbert <lee.tibbert@gmail.com> Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name> --- block/bfq-wf2q.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-) diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c index eeaf326..add54f2 100644 --- a/block/bfq-wf2q.c +++ b/block/bfq-wf2q.c @@ -80,6 +80,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, { struct bfq_entity *next_in_service = sd->next_in_service; bool parent_sched_may_change = false; + bool change_without_lookup = false; /* * If this update is triggered by the activation, requeueing @@ -99,7 +100,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, * set to true, and left as true if * sd->next_in_service is NULL. 
*/ - bool replace_next = true; + change_without_lookup = true; /* * If there is already a next_in_service candidate @@ -112,7 +113,7 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, struct bfq_service_tree *st = sd->service_tree + new_entity_class_idx; - replace_next = + change_without_lookup = (new_entity_class_idx == bfq_class_idx(next_in_service) && @@ -122,15 +123,16 @@ static bool bfq_update_next_in_service(struct bfq_sched_data *sd, new_entity->finish)); } - if (replace_next) + if (change_without_lookup) next_in_service = new_entity; - } else /* invoked because of a deactivation: lookup needed */ + } + + if (!change_without_lookup) /* lookup needed */ next_in_service = bfq_lookup_next_entity(sd, expiration); - if (next_in_service) { + if (next_in_service) parent_sched_may_change = !sd->next_in_service || bfq_update_parent_budget(next_in_service); - } sd->next_in_service = next_in_service; -- 2.10.0 ^ permalink raw reply related [flat|nested] 16+ messages in thread
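[Editorial illustration, not part of the patch.] The control flow this patch introduces can be modeled as below. The structures are hypothetical stand-ins and the full lookup is stubbed to return a known eligible entity; the point is that the lookup now runs whenever the shorter path yields no replacement, not only on deactivation, so the chosen entity is always one the lookup deems eligible.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the real bfq structures. */
struct entity {
	int class_idx;
	unsigned long start, finish;
};

struct sched_data {
	unsigned long vtime;		/* system virtual time */
	struct entity *next_in_service;
	struct entity *lookup_result;	/* what a full lookup would return */
};

/* Stub for bfq_lookup_next_entity(): assumed to return an eligible
 * entity (possibly after a vtime push-up). */
static struct entity *lookup_next_entity(struct sched_data *sd)
{
	return sd->lookup_result;
}

/*
 * Control flow after this patch: the shorter path is attempted on
 * activation/requeueing, but the full lookup runs whenever that path
 * produces no replacement, so the result is always an eligible
 * entity.
 */
static struct entity *update_next_in_service(struct sched_data *sd,
					     struct entity *new_entity)
{
	struct entity *next = sd->next_in_service;
	bool change_without_lookup = false;

	if (new_entity) {
		change_without_lookup = true;
		if (next)
			/* same class, eligible, smaller finish time */
			change_without_lookup =
				new_entity->class_idx == next->class_idx &&
				new_entity->start <= sd->vtime &&
				new_entity->finish < next->finish;
		if (change_without_lookup)
			next = new_entity;
	}

	if (!change_without_lookup)	/* lookup needed */
		next = lookup_next_entity(sd);

	sd->next_in_service = next;
	return next;
}
```

In the corner case the commit message describes, both the new entity and the current candidate are non-eligible: the shorter path fails and the full lookup supplies an eligible entity instead.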
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente ` (2 preceding siblings ...) 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 3/3] block, bfq: guarantee update_next_in_service always returns an eligible entity Paolo Valente @ 2017-08-31 14:21 ` Jens Axboe 2017-08-31 14:42 ` Mel Gorman 4 siblings, 0 replies; 16+ messages in thread From: Jens Axboe @ 2017-08-31 14:21 UTC (permalink / raw) To: Paolo Valente Cc: linux-block, linux-kernel, ulf.hansson, broonie, mgorman, lee.tibbert, oleksandr On 08/31/2017 12:46 AM, Paolo Valente wrote: > [SECOND TAKE, with just the name of one of the tester fixed] > > Hi, > while testing the read-write unfairness issues reported by Mel, I > found BFQ failing to guarantee good responsiveness against heavy > random sync writes in the background, i.e., multiple writers doing > random writes and systematic fdatasync [1]. The failure was caused by > three related bugs, because of which BFQ failed to guarantee to > high-weight processes the expected fraction of the throughput. > > The three patches in this series fix these bugs. These fixes restore > the usual BFQ service guarantees (and thus optimal responsiveness > too), against the above background workload and, probably, against > other similar workloads. Applied for 4.14, thanks Paolo. -- Jens Axboe ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente ` (3 preceding siblings ...) 2017-08-31 14:21 ` [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Jens Axboe @ 2017-08-31 14:42 ` Mel Gorman 2017-08-31 17:06 ` Mike Galbraith 2017-09-04 8:14 ` Mel Gorman 4 siblings, 2 replies; 16+ messages in thread From: Mel Gorman @ 2017-08-31 14:42 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, broonie, lee.tibbert, oleksandr On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: > [SECOND TAKE, with just the name of one of the tester fixed] > > Hi, > while testing the read-write unfairness issues reported by Mel, I > found BFQ failing to guarantee good responsiveness against heavy > random sync writes in the background, i.e., multiple writers doing > random writes and systematic fdatasync [1]. The failure was caused by > three related bugs, because of which BFQ failed to guarantee to > high-weight processes the expected fraction of the throughput. > Queued on top of Ming's most recent series even though that's still a work in progress. I should know in a few days how things stand. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 14:42 ` Mel Gorman @ 2017-08-31 17:06 ` Mike Galbraith 2017-09-04 8:14 ` Mel Gorman 1 sibling, 0 replies; 16+ messages in thread From: Mike Galbraith @ 2017-08-31 17:06 UTC (permalink / raw) To: Mel Gorman, Paolo Valente Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, broonie, lee.tibbert, oleksandr On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote: > On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: > > [SECOND TAKE, with just the name of one of the tester fixed] > > > > Hi, > > while testing the read-write unfairness issues reported by Mel, I > > found BFQ failing to guarantee good responsiveness against heavy > > random sync writes in the background, i.e., multiple writers doing > > random writes and systematic fdatasync [1]. The failure was caused by > > three related bugs, because of which BFQ failed to guarantee to > > high-weight processes the expected fraction of the throughput. > > > > Queued on top of Ming's most recent series even though that's still a work > in progress. I should know in a few days how things stand. It seems to have cured an interactivity issue I regularly meet during kbuild final link/depmod phase of fat kernel kbuild, especially bad with evolution mail usage during that on spinning rust. Can't really say for sure given this is not based on measurement. -Mike ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 17:06 ` Mike Galbraith @ 2017-08-31 17:12 ` Paolo Valente -1 siblings, 0 replies; 16+ messages in thread From: Paolo Valente @ 2017-08-31 17:12 UTC (permalink / raw) To: Mike Galbraith Cc: Mel Gorman, Jens Axboe, linux-block, Linux Kernel Mailing List, Ulf Hansson, broonie, lee.tibbert, oleksandr > On 31 Aug 2017, at 19:06, Mike Galbraith <efault@gmx.de> wrote: > > On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote: >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: >>> [SECOND TAKE, with just the name of one of the tester fixed] >>> >>> Hi, >>> while testing the read-write unfairness issues reported by Mel, I >>> found BFQ failing to guarantee good responsiveness against heavy >>> random sync writes in the background, i.e., multiple writers doing >>> random writes and systematic fdatasync [1]. The failure was caused by >>> three related bugs, because of which BFQ failed to guarantee to >>> high-weight processes the expected fraction of the throughput. >>> >> >> Queued on top of Ming's most recent series even though that's still a work >> in progress. I should know in a few days how things stand. > > It seems to have cured an interactivity issue I regularly meet during > kbuild final link/depmod phase of fat kernel kbuild, especially bad > with evolution mail usage during that on spinning rust. Can't really > say for sure given this is not based on measurement. > Great! Actually, when I found these bugs, I thought also about the issues you told me you experienced with updatedb running. But then I forgot to tell you that these fixes might help. Thanks, Paolo > -Mike ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 17:12 ` Paolo Valente @ 2017-08-31 17:31 ` Mike Galbraith -1 siblings, 0 replies; 16+ messages in thread From: Mike Galbraith @ 2017-08-31 17:31 UTC (permalink / raw) To: Paolo Valente Cc: Mel Gorman, Jens Axboe, linux-block, Linux Kernel Mailing List, Ulf Hansson, broonie, lee.tibbert, oleksandr On Thu, 2017-08-31 at 19:12 +0200, Paolo Valente wrote: > > On 31 Aug 2017, at 19:06, Mike Galbraith <efault@gmx.de> wrote: > > > > On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote: > >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: > >>> [SECOND TAKE, with just the name of one of the tester fixed] > >>> > >>> Hi, > >>> while testing the read-write unfairness issues reported by Mel, I > >>> found BFQ failing to guarantee good responsiveness against heavy > >>> random sync writes in the background, i.e., multiple writers doing > >>> random writes and systematic fdatasync [1]. The failure was caused by > >>> three related bugs, because of which BFQ failed to guarantee to > >>> high-weight processes the expected fraction of the throughput. > >>> > >> > >> Queued on top of Ming's most recent series even though that's still a work > >> in progress. I should know in a few days how things stand. > > > > It seems to have cured an interactivity issue I regularly meet during > > kbuild final link/depmod phase of fat kernel kbuild, especially bad > > with evolution mail usage during that on spinning rust. Can't really > > say for sure given this is not based on measurement. > > > > > Great! Actually, when I found these bugs, I thought also about the > issues you told me you experienced with updatedb running. But then I > forgot to tell you that these fixes might help. I'm going to actively test that, because that is every bit as infuriating as the evolution thing, only updatedb is nukable. In fact, it infuriated me to the point that it no longer has a crontab entry, runs only when I decide to run it. At this point, I'll be pretty surprised if that rotten <naughty words> is still alive. -Mike ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg @ 2017-08-31 17:31 ` Mike Galbraith 0 siblings, 0 replies; 16+ messages in thread From: Mike Galbraith @ 2017-08-31 17:31 UTC (permalink / raw) To: Paolo Valente Cc: Mel Gorman, Jens Axboe, linux-block, Linux Kernel Mailing List, Ulf Hansson, broonie, lee.tibbert, oleksandr On Thu, 2017-08-31 at 19:12 +0200, Paolo Valente wrote: > > Il giorno 31 ago 2017, alle ore 19:06, Mike Galbraith <efault@gmx.de> ha scritto: > > > > On Thu, 2017-08-31 at 15:42 +0100, Mel Gorman wrote: > >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: > >>> [SECOND TAKE, with just the name of one of the tester fixed] > >>> > >>> Hi, > >>> while testing the read-write unfairness issues reported by Mel, I > >>> found BFQ failing to guarantee good responsiveness against heavy > >>> random sync writes in the background, i.e., multiple writers doing > >>> random writes and systematic fdatasync [1]. The failure was caused by > >>> three related bugs, because of which BFQ failed to guarantee to > >>> high-weight processes the expected fraction of the throughput. > >>> > >> > >> Queued on top of Ming's most recent series even though that's still a work > >> in progress. I should know in a few days how things stand. > > > > It seems to have cured an interactivity issue I regularly meet during > > kbuild final link/depmod phase of fat kernel kbuild, especially bad > > with evolution mail usage during that on spinning rust. Can't really > > say for sure given this is not based on measurement. > > > > > Great! Actually, when I found these bugs, I thought also about the > issues you told me you experienced with updatedb running. But then I > forgot to tell you that these fixes might help. I'm going to actively test that, because that is every bit as infuriating as the evolution thing, only updatedb is nukable. 
In fact, it infuriated me to the point that it no longer has a crontab entry, runs only when I decide to run it. At this point, I'll be pretty surprised if that rotten <naughty words> is still alive. -Mike ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-08-31 14:42 ` Mel Gorman 2017-08-31 17:06 ` Mike Galbraith @ 2017-09-04 8:14 ` Mel Gorman 2017-09-04 8:55 ` Paolo Valente 2017-09-04 9:07 ` Ming Lei 1 sibling, 2 replies; 16+ messages in thread From: Mel Gorman @ 2017-09-04 8:14 UTC (permalink / raw) To: Paolo Valente Cc: Jens Axboe, linux-block, linux-kernel, ulf.hansson, broonie, lee.tibbert, oleksandr On Thu, Aug 31, 2017 at 03:42:57PM +0100, Mel Gorman wrote: > On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: > > [SECOND TAKE, with just the name of one of the tester fixed] > > > > Hi, > > while testing the read-write unfairness issues reported by Mel, I > > found BFQ failing to guarantee good responsiveness against heavy > > random sync writes in the background, i.e., multiple writers doing > > random writes and systematic fdatasync [1]. The failure was caused by > > three related bugs, because of which BFQ failed to guarantee to > > high-weight processes the expected fraction of the throughput. > > > > Queued on top of Ming's most recent series even though that's still a work > in progress. I should know in a few days how things stand. > The problems with parallel heavy writers seem to have disappeared with this series. Revisions are still taking place on Ming's series, and the overall handling of legacy vs mq is still a work in progress, but this series looks good. Thanks. -- Mel Gorman SUSE Labs ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg @ 2017-09-04 8:55 ` Paolo Valente 0 siblings, 0 replies; 16+ messages in thread From: Paolo Valente @ 2017-09-04 8:55 UTC (permalink / raw) To: Mel Gorman Cc: Jens Axboe, linux-block, Linux Kernel Mailing List, Ulf Hansson, broonie, lee.tibbert, oleksandr > Il giorno 04 set 2017, alle ore 10:14, Mel Gorman <mgorman@techsingularity.net> ha scritto: > > On Thu, Aug 31, 2017 at 03:42:57PM +0100, Mel Gorman wrote: >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: >>> [SECOND TAKE, with just the name of one of the tester fixed] >>> >>> Hi, >>> while testing the read-write unfairness issues reported by Mel, I >>> found BFQ failing to guarantee good responsiveness against heavy >>> random sync writes in the background, i.e., multiple writers doing >>> random writes and systematic fdatasync [1]. The failure was caused by >>> three related bugs, because of which BFQ failed to guarantee to >>> high-weight processes the expected fraction of the throughput. >>> >> >> Queued on top of Ming's most recent series even though that's still a work >> in progress. I should know in a few days how things stand. >> > > The problems with parallel heavy writers seem to have disappeared with this > series. There are still revisions taking place on Ming's to overall setting > of legacy vs mq is still a work in progress but this series looks good. > Great news! Thanks for testing, Paolo > Thanks. > > -- > Mel Gorman > SUSE Labs ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg 2017-09-04 8:14 ` Mel Gorman 2017-09-04 8:55 ` Paolo Valente @ 2017-09-04 9:07 ` Ming Lei 1 sibling, 0 replies; 16+ messages in thread From: Ming Lei @ 2017-09-04 9:07 UTC (permalink / raw) To: Mel Gorman Cc: Paolo Valente, Jens Axboe, linux-block, Linux Kernel Mailing List, ulf.hansson, Mark Brown, lee.tibbert, oleksandr On Mon, Sep 4, 2017 at 4:14 PM, Mel Gorman <mgorman@techsingularity.net> wrote: > On Thu, Aug 31, 2017 at 03:42:57PM +0100, Mel Gorman wrote: >> On Thu, Aug 31, 2017 at 08:46:28AM +0200, Paolo Valente wrote: >> > [SECOND TAKE, with just the name of one of the tester fixed] >> > >> > Hi, >> > while testing the read-write unfairness issues reported by Mel, I >> > found BFQ failing to guarantee good responsiveness against heavy >> > random sync writes in the background, i.e., multiple writers doing >> > random writes and systematic fdatasync [1]. The failure was caused by >> > three related bugs, because of which BFQ failed to guarantee to >> > high-weight processes the expected fraction of the throughput. >> > >> >> Queued on top of Ming's most recent series even though that's still a work >> in progress. I should know in a few days how things stand. >> > > The problems with parallel heavy writers seem to have disappeared with this > series. There are still revisions taking place on Ming's to overall setting > of legacy vs mq is still a work in progress but this series looks good. Hi Mel and Paolo, BTW, there is no actual functional change in V4. Also, could you guys provide a Tested-by, since it looks like you are using it in your tests? Thanks, Ming Lei ^ permalink raw reply [flat|nested] 16+ messages in thread
end of thread, other threads:[~2017-09-04 9:07 UTC | newest] Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-31 6:46 [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 1/3] block, bfq: make lookup_next_entity push up vtime on expirations Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 2/3] block, bfq: remove direct switch to an entity in higher class Paolo Valente 2017-08-31 6:46 ` [PATCH BUGFIX/IMPROVEMENT V2 3/3] block, bfq: guarantee update_next_in_service always returns an eligible entity Paolo Valente 2017-08-31 14:21 ` [PATCH BUGFIX/IMPROVEMENT V2 0/3] three bfq fixes restoring service guarantees with random sync writes in bg Jens Axboe 2017-08-31 14:42 ` Mel Gorman 2017-08-31 17:06 ` Mike Galbraith 2017-08-31 17:06 ` Mike Galbraith 2017-08-31 17:12 ` Paolo Valente 2017-08-31 17:12 ` Paolo Valente 2017-08-31 17:31 ` Mike Galbraith 2017-08-31 17:31 ` Mike Galbraith 2017-09-04 8:14 ` Mel Gorman 2017-09-04 8:55 ` Paolo Valente 2017-09-04 8:55 ` Paolo Valente 2017-09-04 9:07 ` Ming Lei